The design of advanced technology engines is often limited by compressor
blade instability, or flutter. Without adequate design tools, designs are
either overly conservative, with less than maximum levels of
performance, or prone to blade flutter, which can lead to the complete
destruction of an engine. Consequently, the accurate prediction of the
flutter boundaries is a key requirement for the successful design of
future engines.

In the absence of adequate theoretical analyses, partly because of
the lack of detailed flow information but largely because of the
complexity of the unsteady flow fields present during stall and choke
flutter, it is necessary to develop valid empirical prediction design
systems for flutter-free designs based on representative flutter data.

The flutter data available from component or engine development
programs were, however, very limited. These data provided only a small
window in both operational characteristics and the data needed to
quantify detailed aerodynamic and mechanical parameters in the flutter
region. In view of these limitations, an Air Force sponsored program,
"Experimental Analysis of Blade Instability" (Contract F33615-76-C-2035),
was initiated to widen the data window for both stall and choke flutter.
In this program, numerous tests were conducted using a stationary annular
cascade. Aerodynamic and structural parameters were varied systematically
to provide a flutter data bank for both flutter regimes. The
primary parameters varied included design reduced velocity, solidity,
relative density, leading edge Mach number, and incidence angle, all for
front stage designs.

As an extension to this program, other variables such as camber, stagger,
and frequency tuning were investigated, under Company sponsorship, to
determine their effects on the stall and choke flutter boundaries. The
data bank generated by these experimental programs consists
of several thousand data points.

An empirical prediction design system for the onset of flutter, based on
regression analysis, was attempted as part of the Annular Cascade
program. The intent was to parameterize these data to identify the
significant flutter parameters and subsequently to determine the
appropriate design format, requirements, and procedures. The prediction
correlation was not totally successful. Although the data did collapse
using the developed Standard Day Flutter Parameter, FPSD, the
distinction between the flutter and stable points was at times
ambiguous.

The results of this regression study indicated that either additional
variables or other combinations of the present variables need to be
established to provide adequate separation between the stable and flutter
data points. The regression study also showed the need for a
geometric methodology that would directly address the geometric problem
of separating the stable points from the flutter points, and the various
types of flutter points from one another.

The  present  program  was  undertaken  to  develop  the  appropriate  geometric 
methodology  and  apply  it  to  the  Annular  Cascade  Data  Base.  The  objective 
was  to  predict  stall  and  choke  flutter  and  then  to  validate  the 
methodology using available engine data.

The name GALACTIC is an acronym for Geometric Analysis of Large Arrays
Containing Three or more Independent Coordinates.

As the name indicates, the program's capabilities are not specific to
the flutter prediction problem. All of them were intended to be useful
for that problem, but they are much more generic.

The fundamental point of view underlying GALACTIC is as follows. If,
in the flutter prediction problem, there were only two (possibly
three) independent variables, for example temperature and pressure,
the test points would be plotted versus these variables and
labeled according to the stability condition that was observed. The
points of the same stability condition would then be enclosed by a
curve to define a region where that stability condition would be
expected to prevail for engine data not in the data base.

The intent of GALACTIC is simply to extend to higher dimensional spaces
the geometric analysis that is so intuitive and straightforward in two
or possibly three dimensional spaces.

Some  of  the  geometric  tasks  that  the  human  eye  does  so  effortlessly  in 
two  dimensions  are  as  follows: 

o Normalization of variables to comparable scales

o Shape, size and orientation of groupings of points, whether grouped
by like stability condition or by geometric adjacency

o Extension of points of like stability condition into continuous
enclosing regions where that stability condition prevails

o Decision lines between the regions by which to predict the
stability condition of a new point whose stability condition is
unknown.

It is a simple yet accurate statement to say that GALACTIC and the
methodology it embodies intend to translate these geometric tasks
into algebraic tasks and then to extend these to higher dimensions.

The programming aspects will be described first, and then the
underlying mathematical technology.

GALACTIC is written in FORTRAN 77. Its development was carried out
mostly on the Honeywell DPS92 computer, but it was subsequently
transferred to the VAX4 computer, where it is now running. Its listing
is currently 4875 lines long, of which 2661 lines are in the main
program and the remaining 2214 lines are in subroutines.

Among the subroutines, twelve are from the Honeywell subroutine library
supplied by the International Mathematical and Statistical Libraries;
these routines were part of GALACTIC as it was being developed on the
Honeywell computer and were subsequently transferred to the VAX4 when
GALACTIC as a whole was moved to that computer. These routines are:
ZX1LP, ZX3LP, LSVDB, LSVDF, LSVG1, LSVG2, UERSET, UERTST, UGETIO,
USPKD, VHS12, SROTG. They occupy 1142 lines of the listing.

The program can be run in batch or time sharing mode; it will detect
which is being used. In batch mode, it must be supplied with a
file containing the input in the order in which it is needed.

The program input will be described first, then the output, followed by
the logical sequence of the computation. The mathematical methodology
is presented in the succeeding section. Actually, these four parts are
so interdependent that they must be read in coordination with one
another.

The program input will be described in the time sharing mode with an
occasional comment concerning batch usage. Generally, there is no
distinction between the two; the program in one case reads the time
sharing terminal and in the other case it reads a data file.

To  facilitate  the  exact  definition  of  the  input,  the  read  statement 
will  be  located  in  the  program  (listing  as  of  16:31  on  July  20,  1987); 
for  example,  S580-2  means  2  lines  prior  to  statement  labeled  580. 
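
The reference notation can be decoded mechanically. The following is a minimal sketch in Python (used for illustration only; GALACTIC itself is FORTRAN 77). The function name `parse_statement_ref` is hypothetical and does not appear in the program.

```python
import re

def parse_statement_ref(ref):
    """Split a reference such as 'S580-2' into (statement label, line offset).

    A negative offset means lines prior to the labeled statement and a
    positive offset means lines after it; a bare label has offset 0.
    """
    match = re.fullmatch(r"S(\d+)([+-]\d+)?", ref.strip())
    if match is None:
        raise ValueError(f"not a statement reference: {ref!r}")
    label = int(match.group(1))
    offset = int(match.group(2)) if match.group(2) else 0
    return label, offset

print(parse_statement_ref("S580-2"))  # 2 lines prior to statement 580
print(parse_statement_ref("S600"))    # the labeled statement itself
```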

All input read statements are preceded by prompts explaining what
input is needed by the read statement. The prompts will be shown here,
followed by any needed explanation. The statement number refers to the
read statement, not the prompt.

S155+1  "Enter number and names of sample files, words/record, Y/N to
clearfiles." Each record on an input file contains the data
for a test point from the Annular Cascade Data Base. The
names should be in single quotes for the VAX4. Clearfiles
allows usage of temporary data files left over from a prior
run. 'Y' or 'N' must also be in single quotes.

S250+1  "Enter  N  (LE  14),  then  name  N  variables."  These  identify  the 
words  from  each  test  point  record  which  will  be  used  to 
describe  a  test  point.  The  words  are  named  by  their  numerical 
order. 

S250+5  "Which other variable identifies subsets (0 if none)." This
is the number of the word which contains the identification of
the set to which the point belongs. If the set is unknown,
the identification code "99" is used. Omitting the set
identification variable, while possible in early versions of
the program, may not lead to valid results in the current
program and should be avoided.


S290+1  "Enter N (LE 14), then ID of N sets to be excluded."

S295+4  "Enter  N  (LE  20),  then  name  N  points  to  be  excluded." 

S295+10  "Enter N, then N pairs of ID's to be labeled as the first."
The test points from the data base are identified by the
numerical order in which they are read in by the program,
including test points read in from any prior files. If "a
pair of ID's is labeled as the first," any points bearing the
second ID will be altered so that they bear the first ID.
Thus, the two sets are combined under the label of the first
set.
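
The relabeling of set IDs can be sketched as follows. This is illustrative Python, not the FORTRAN source; the function name `merge_set_ids` and the sample ID values are hypothetical, and applying the pairs in the order entered is an assumption.

```python
def merge_set_ids(point_ids, id_pairs):
    """Relabel points: for each (first, second) pair, every point bearing
    the second ID is altered so that it bears the first ID."""
    relabeled = list(point_ids)
    for first, second in id_pairs:  # pairs applied in input order (assumption)
        relabeled = [first if pid == second else pid for pid in relabeled]
    return relabeled

ids = [1, 2, 3, 2, 99]               # 99 is the code for an unknown set
print(merge_set_ids(ids, [(1, 2)]))  # sets 1 and 2 combined under label 1
```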

S470+1  "Choose typical scale options (MIN, MAX, AVER, SIGMA, MID,
RANGE, SPECIAL)." The typical value is subtracted from each
test point variable, and the result is divided by the scale
value. SIGMA means standard deviation of the sample. MID
means average of the MAX and MIN values. The options listed,
except for SPECIAL, call upon the program to use the values it
has computed from the test points read in. There are six
other options, MNU, MXU, AVU, SGU, MDU, RNU, which are like
the first six except that they were computed earlier from all
the files in the sample (891 data points) from the Data Base
and are contained in S110+4 to S120-2. SPECIAL allows the
user to enter typical and scale values for the NV variables
being used. In this case, there will be two additional
prompts:

S540+1  "Enter  typical  values  for  the  NV  variables." 

S570+1  "Enter  scale  values  for  the  NV  variables." 
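
The normalization described above, (value - typical) / scale, can be sketched for a single variable as follows. This is illustrative Python rather than the FORTRAN source; the function names and the sample data are hypothetical, and SIGMA is taken here to be the sample standard deviation with an n-1 divisor, which is an assumption.

```python
import statistics

def variable_summary(values):
    """Summary quantities keyed by the option names in the prompt."""
    lo, hi = min(values), max(values)
    return {
        "MIN": lo,
        "MAX": hi,
        "AVER": statistics.mean(values),
        "SIGMA": statistics.stdev(values),  # n-1 divisor (assumption)
        "MID": (lo + hi) / 2.0,             # average of MAX and MIN
        "RANGE": hi - lo,
    }

def normalize(values, typical_option, scale_option):
    """Subtract the typical value, then divide by the scale value."""
    summary = variable_summary(values)
    typical = summary[typical_option]
    scale = summary[scale_option]
    if scale == 0:
        raise ValueError("scale value is zero")
    return [(v - typical) / scale for v in values]

pressures = [10.0, 12.0, 14.0, 20.0]   # hypothetical data
print(normalize(pressures, "MID", "RANGE"))
```

With MID as the typical value and RANGE as the scale value, every normalized value falls between -0.5 and 0.5, which puts variables with very different units on comparable scales.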

S580+3  "Enter count of bonded points (I & J forced into same
cluster)."

S600  "Enter the NBOND pairs of point numbers." The first read
acquires NBOND, which is used in the second read. If point
number I and point number J are found to be in different
clusters, the two clusters will be combined.
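
The effect of bonding can be sketched with a small union-find structure. This is illustrative Python, not the method used in the FORTRAN source; the function name `apply_bonds` is hypothetical, and keeping the lower cluster number as the surviving label is an assumption.

```python
def apply_bonds(cluster_of, bonds):
    """Merge clusters so that each bonded pair (i, j) shares one cluster.

    cluster_of: dict mapping point number -> cluster number.
    bonds: list of (i, j) point-number pairs (the NBOND pairs).
    """
    parent = {c: c for c in set(cluster_of.values())}

    def find(c):
        # Follow parent links to the representative cluster.
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for i, j in bonds:
        root_i, root_j = find(cluster_of[i]), find(cluster_of[j])
        if root_i != root_j:
            # Lower cluster number survives (assumption).
            keep, drop = min(root_i, root_j), max(root_i, root_j)
            parent[drop] = keep

    return {point: find(c) for point, c in cluster_of.items()}

clusters = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}   # hypothetical assignment
print(apply_bonds(clusters, [(2, 3)]))      # clusters 1 and 2 combined
```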

S630+5  "Enter 0 or 1 to notwrite, or write following output
options."

"For all data: SMN, NTR, NPT, BAX, BTR, BPT."

S630+9  "For setdata: IPI, MEM, DSJ, HPL, BAX, PRJ, BTR, CGR, COR,
EST, HPP, HPO."

S630+13  "For clustrs: EDS, SDS, STP, FPT, MEM, CCD, BAX, PRJ, BTR,
CGR, COR, SUR."

The input for the first read is a six-digit word, and for the
second and third reads, it is a twelve-digit word, the digits
being 0 or 1. The specific output options will be described
in the next section on Program Output. It is important to
note that calculations are omitted within GALACTIC if no
output derived from them is requested. Thus, the running time
is heavily dependent on these three output options.
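
The mapping from a code word to the selected options can be sketched as follows. This is illustrative Python, not the FORTRAN source; the function name `decode_output_word` is hypothetical, and the twelfth set option is written HPO here, matching the label used where that output is described.

```python
ALLDATA_OPTIONS = ["SMN", "NTR", "NPT", "BAX", "BTR", "BPT"]
SETDATA_OPTIONS = ["IPI", "MEM", "DSJ", "HPL", "BAX", "PRJ",
                   "BTR", "CGR", "COR", "EST", "HPP", "HPO"]
CLUSTER_OPTIONS = ["EDS", "SDS", "STP", "FPT", "MEM", "CCD",
                   "BAX", "PRJ", "BTR", "CGR", "COR", "SUR"]

def decode_output_word(word, option_names):
    """Return the option names whose digit is 1, counting from the left."""
    if len(word) != len(option_names) or set(word) - {"0", "1"}:
        raise ValueError(f"expected {len(option_names)} digits of 0 or 1")
    return [name for digit, name in zip(word, option_names) if digit == "1"]

print(decode_output_word("100100", ALLDATA_OPTIONS))
print(decode_output_word("000100000000", SETDATA_OPTIONS))
```

Since calculations are skipped when no output depends on them, a word with few 1 digits also implies a shorter run.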

S3000+3  "Name  Revised  Sample  File  &  Population  Files  (in  sample 
format)  ("  "  if  none)." 

The  files  read  in  originally  at  S155+1  were  the  original 
sample  files.  The  Revised  Sample  file  is  intended  to  be  used 
in  the  same  way  in  subsequent  runs  of  GALACTIC.  It  attempts 
to  extract  from  the  original  sample  files  and  the  Population 
files  the  critical  test  points,  that  is  the  test  points  that 
would  be  needed  in  developing  revised  hyperplanes.  The 
Population  files  are  in  the  same  format  as  the  original  and 
Revised  Sample  files  and  have  test  points  in  the  same 
stability  regions,  but  their  points  were  not  used  in 
constructing  the  hyperplanes.  The  underlying  thought  is  that 
if  there  are  more  test  points  available  than  can  be  analyzed 
simultaneously,  it  is  possible  to  pick  a  sample,  develop 
hyperplanes  for  the  sample,  test  these  against  the  larger 
population,  and  revise  the  sample  to  exclude  the  population 
points which were found to be redundant.
In this way, we hope to find a sample of critical
points that will generate hyperplanes satisfied
by the entire population. This concept was developed because of
memory restrictions on the Honeywell DPS92, but has not proven
necessary on the VAX4, thanks to the latter's virtual memory
system.
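
The sample revision idea can be sketched as follows, using the selection rule given with the population output: non-redundant points from the original sample are kept, and population points are added if their identification code is absent from the sample or if the hyperplanes misclassify them. This is illustrative Python, not the FORTRAN source; `revise_sample`, `is_redundant` and `classify` are hypothetical names, and the one-variable data are invented for the example.

```python
def revise_sample(sample, population, is_redundant, classify):
    """Build a revised sample from (set_id, point) records.

    is_redundant(set_id, point) -> bool : stands in for the redundancy test.
    classify(point) -> set_id           : stands in for the hyperplane test.
    """
    sample_ids = {set_id for set_id, _ in sample}
    revised = [(set_id, point) for set_id, point in sample
               if not is_redundant(set_id, point)]
    for set_id, point in population:
        unseen_id = set_id not in sample_ids
        misclassified = classify(point) != set_id
        if unseen_id or misclassified:
            revised.append((set_id, point))
    return revised

sample = [(1, (1.0,)), (1, (2.0,)), (1, (4.0,)), (2, (6.0,)), (2, (9.0,))]
population = [(1, (3.0,)), (1, (5.5,)), (3, (7.0,))]
redundant_points = {(1, (2.0,))}   # assumed known from a prior analysis

revised = revise_sample(
    sample, population,
    is_redundant=lambda set_id, point: (set_id, point) in redundant_points,
    classify=lambda point: 1 if point[0] < 5.0 else 2,  # toy decision rule
)
print(revised)
```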

S3059+1  "Enter  N  (LE  14),  then  ID  of  N  sets  to  be  excluded." 

S3059+4  "Enter N (LE 20), then name N population points to be
excluded."

S3054+1  "Enter N, then N pairs of ID's to be labeled as the first."

This input has the same meaning as the input at S290+1, S295+4
and S295+10.

S3050+1  "Revised  sample  file,  FILINR,  already  exists." 

"Enter  1  to  overwrite  or  2  to  enter  new  name." 

"Enter  name  of  revised  sample  file." 

In  case  the  output  file  FILINR  already  exists,  the  time  sharing 
user  may  overwrite  or  supply  a  new  name. 

S3495+4  "Enter name of next population file (" " if none)."

S3491+2  "Enter N (LE 14), then ID of N sets to be excluded."

S3491+5  "Enter N (LE 20), then name N population points to be
excluded." This input has the same meaning as the preceding input.

S4100+2  "Name input, output hyperplane file (" " to omit)." Enter,
in single quotes, the name of the already existing file FILEHI
of hyperplanes, as well as the designated name for FILEHO, the
hyperplane file to be produced. Each record on these files
describes a single hyperplane and consists of four more words
than the number specified in S155+1. The first two words
contain the identification for the two sets, the second two
words contain the constant term of the hyperplane equation for
each of the two sets, and the remaining words are coefficients
for each word in the test point records. In case FILEHO is an
already existing file, the usual overwrite or name revision is
provided for.

S4149  "Choose to save old, new planes (O, N) for sets _ , _"
The user specifies Old or New. This choice is made on the
basis of output available earlier in the current run, as
compared with output available in the prior runs that produced
the old hyperplane equation.

S9010+3  "Enter 0, N notto, to transform data per cluster N axes, N LE 0
per all data."

S9020+1  "Enter max number of axes to be used."
This causes the data originally read in accord with S155+1 to
be read again and then transformed from the original variables
into new variables defined as the leading axes of one of the
clusters, or of the entirety of the input data, i.e. the
alldata axes. This option requires that the test points be
similarly transformed any time they are read afresh from the
Data Base files. Since this is not currently done, the option
is safe to use only if the calculations required by the user
do not entail rereading the test point data. This, for
example, is the case if only cluster analysis is required.
(This restriction on the use of this option could be easily
removed if the need justified doing so.)

S9140+2  "Enter 0, 1 ifnot, if above cluster CG's to be treated as
points." This option occurs only if the prior option was not
exercised. It is subject to the same restriction. The idea
of the option is to condense the clusters into single points
situated at their center of gravity, and to reanalyze the
problem in this simplified form, finding, for example, new
clusters (or galaxies) made up of the old clusters, much in
the spirit of classical mechanics. The option is, therefore,
allowed to remain despite the restriction on its use, as a
vestige awaiting possible future activation.

S9170+1  "Enter 0,1,2 as there arent, are more cases, change case."
If there are no more cases, the program stops at S9180. If there are
more cases, the program goes either to the beginning, S130, or
to the input on bonding and output codes, S580, saving the user
the burden of repeating all the input.

The major output is the updated set of pairs of hyperplanes, which is
written out on a data file. The format consists of the stability code
for the first set, the stability code for the second set, the left-hand
side constant for the first set, the left-hand side constant for the
second set, followed by a coefficient for each of the words in the
format of the test points supplied from the Annular Cascade Data Base.
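
The record layout just described can be sketched as a small data structure. This is illustrative Python, not the FORTRAN source; the class and function names are hypothetical, treating the stability codes as integers is an assumption, and the sketch only evaluates the weighted sum of a test point's words without asserting how that sum is compared with the two constants.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HyperplanePair:
    first_set: int            # stability code of the first set
    second_set: int           # stability code of the second set
    first_constant: float     # constant for the first set
    second_constant: float    # constant for the second set
    coefficients: List[float] # one coefficient per word of a test point record

def parse_hyperplane_record(words):
    """Interpret one record: 2 codes, 2 constants, then the coefficients."""
    if len(words) < 5:
        raise ValueError("record needs four leading words plus coefficients")
    return HyperplanePair(int(words[0]), int(words[1]),
                          float(words[2]), float(words[3]),
                          [float(w) for w in words[4:]])

def weighted_sum(pair, test_point):
    """Sum of coefficient * word over one test point record."""
    if len(test_point) != len(pair.coefficients):
        raise ValueError("test point and coefficient counts differ")
    return sum(c * x for c, x in zip(pair.coefficients, test_point))

record = [1, 2, 4.0, 6.0, 1.0, 0.0, 2.0]   # hypothetical record
pair = parse_hyperplane_record(record)
print(pair.first_constant, pair.second_constant,
      weighted_sum(pair, [1.0, 5.0, 2.0]))
```

A zero coefficient, as for the second word here, simply means that word of the test point record does not enter the hyperplane.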


Next we should note that the program input is printed out when the
program operates in batch mode, so that the hard copy output
includes prompts and responses in very much the same way as if the
program were run in time sharing mode.


The following description of output consists of an explanation of
the thirty 0 or 1 digits in the three output code words that control
program printout. Each of these digits has a three-letter
abbreviated description intended as a memory aid to the user. Digits
are counted from the left of the code word; 1 calls for printout and 0
suppresses it. The first code word, IOUTPUT, refers to preliminary
calculations and has six digits. The second code word, IOUTPS, refers to
the set discrimination calculations and has twelve digits. The third
code word, IOUTPC, refers to cluster analysis and also has twelve digits.


SMN  (Digit 1 of IOUTPUT) Prints out for each set, the set number,
the set code and the number of points. Also prints a summary of
each variable showing minimum, midvalue, maximum, range,
average, and sample standard deviation.

NTR  (Digit 2 of IOUTPUT) Prints the transformation from the original
variables to the variables normalized by a typical value and
scale factor. Also prints the reverse transformation.

NPT  (Digit 3 of IOUTPUT) Prints the data points normalized by a
typical value and a scale factor.

BAX  (Digit 4 of IOUTPUT) Prints for the entirety of the test
points: the center of gravity, the semiaxes of the bottle
containing the data, and the lengths of the semiaxes.

BTR  (Digit 5 of IOUTPUT) Prints transformation to and from the axes
for the entirety of the test points.

BPT  (Digit 6 of IOUTPUT) Prints data points transformed to the axes
for the entirety of the test points.


EST  (Digit 10 of IOUTPS) Prints out a matrix of Euclidean distance
between pairs of sets, that is, the minimum Euclidean distance
between pairs of points, one in one set, one in the other set.
Also prints the maximum distance between points in the same
set. In addition, the points providing these distances are
shown (both if between sets, one if within sets).
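
The two distances reported by this option can be sketched with a brute-force search over pairs of points. This is illustrative Python, not the FORTRAN source; the function names and sample coordinates are hypothetical.

```python
import math
from itertools import combinations, product

def min_distance_between(set_a, set_b):
    """Closest pair across two sets: (distance, index in set_a, index in set_b)."""
    return min((math.dist(p, q), i, j)
               for (i, p), (j, q) in product(enumerate(set_a), enumerate(set_b)))

def max_distance_within(points):
    """Farthest pair inside one set: (distance, index, index)."""
    return max((math.dist(points[i], points[j]), i, j)
               for i, j in combinations(range(len(points)), 2))

stable = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # hypothetical normalized points
flutter = [(3.0, 0.0), (4.0, 4.0)]

print(min_distance_between(stable, flutter))
print(max_distance_within(stable))
```

Comparing the gap between two sets with the diameter of each set gives a first indication of how cleanly the sets are separated.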

IPI  (Digit 1 of IOUTPS) Prints out points of each set which are
vertices due to their having a variable with a maximum or
minimum value. Also, the points found to be redundant are
expressed as convex linear combinations of the vertices.
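
The first step, finding points that are vertices because one of their variables is a maximum or minimum within the set, can be sketched as follows. This is illustrative Python, not the FORTRAN source; the function name is hypothetical, and including every point tied for an extreme value is an assumption. Points not found this way may still be vertices and need the further redundancy test.

```python
def extreme_coordinate_points(points):
    """Indices of points attaining the max or min of at least one variable."""
    n_vars = len(points[0])
    found = set()
    for k in range(n_vars):
        column = [p[k] for p in points]
        lo, hi = min(column), max(column)
        for idx, value in enumerate(column):
            if value == lo or value == hi:  # ties all included (assumption)
                found.add(idx)
    return sorted(found)

# Hypothetical set: the last point lies inside the triangle of the others.
pts = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0), (2.0, 1.0)]
print(extreme_coordinate_points(pts))
```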

MEM  (Digit 2 of IOUTPS) Prints for each point the number of the set
to which it belongs. A minus sign is used to indicate that the
point has been found to be redundant.

DSJ  (Digit 3 of IOUTPS) Prints whether the convex hulls of pairs of
sets are disjoint. Disjointness is sought using only the first
alldata axis, then the first two, etc., using no more than are
necessary. Each step is reported on. If the pair of sets is
not disjoint, then an intersection point is exhibited as a
convex linear combination of the vertices of each set.

HPL  (Digit 4 of IOUTPS) Prints out hyperplanes to discriminate
between each pair of sets. The hyperplanes are developed using
as few axes as possible, beginning with the number of axes found
necessary above (see DSJ). If the sets were previously judged
to be disjoint, then the hyperplanes are sought with a gap;
otherwise, the overlap option is used. If hyperplanes cannot be
obtained successfully, then the opposite case (overlap or gap) is
tried and the result is accepted. The success at each step of
this process is reported. When this is completed for all pairs
of sets, the distance to each test point along the normal to
each pair of hyperplanes is computed and marked with E in case
the point violates a hyperplane. This is printed as well as a
tally of the E's. The planes may be adjusted to remove some
small violations and this too is reported.

HPP  (Digit 11 of IOUTPS) Prints for each test point on a Population
file, the distance to the point along the normal to each pair of
hyperplanes, with one of three marks (the last being -) to indicate
that the point is beyond both planes, falls short of both planes,
or is between both planes, respectively. Once this is completed,
a tally is printed showing
the number of points on the Revised Sample File, the number of
these which were from the original or input sample files and
non-redundant, the balance being from the Population files with
identification code not in the input sample file or
misclassified by hyperplanes. Among those misclassified by the
hyperplanes are various subcategories. The subcategories
indicate whether by adjusting gaps between the planes the
violation could be removed and, if so, whether not only the
correct stability set but also an incorrect stability set might
be indicated. The same subcategories can apply when only
incorrect stability sets are indicated. (These types of
category are examined more completely in the EFAGHY program.)

Optional Output if computed: Prints hyperplanes in terms of the
original input variables; the prior printout showed the
hyperplanes in terms of the original variables transformed by
subtraction of a typical value, division by a scale value, then
subtraction of the center of gravity and conversion to alldata
axes. The conversion of the hyperplanes to original variables
facilitates their application to a large number of points
expressed in terms of the original variables.
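
The conversion can be worked through under stated assumptions. Suppose each original variable x_j is normalized as z_j = (x_j - t_j) / s_j, the center of gravity g is subtracted, and the result is projected on the alldata axes (rows of a matrix A), giving y = A (z - g). A hyperplane written as a . y = c in the axes then becomes b . x = c' in the original variables, with w = A^T a, b_j = w_j / s_j and c' = c + sum_j w_j (t_j / s_j + g_j). The following Python sketch checks this numerically; the form a . y = c, all function names and all numbers are assumptions for illustration and are not taken from the FORTRAN source.

```python
def to_alldata_axes(x, typical, scale, cg, axes):
    """Normalize x, subtract the center of gravity, project on the axes."""
    z = [(xj - tj) / sj for xj, tj, sj in zip(x, typical, scale)]
    return [sum(a_kj * (zj - gj) for a_kj, zj, gj in zip(row, z, cg))
            for row in axes]

def hyperplane_in_original_variables(coeffs, constant, typical, scale, cg, axes):
    """Convert a . y = c (axes) into b . x = c' (original variables)."""
    n = len(typical)
    # w = A^T a; only the axes actually used by the hyperplane contribute.
    w = [sum(coeffs[k] * axes[k][j] for k in range(len(coeffs)))
         for j in range(n)]
    b = [w[j] / scale[j] for j in range(n)]
    c_orig = constant + sum(w[j] * (typical[j] / scale[j] + cg[j])
                            for j in range(n))
    return b, c_orig

typical, scale = [10.0, 100.0], [2.0, 50.0]   # hypothetical values
cg = [0.1, -0.2]
axes = [[0.6, 0.8], [-0.8, 0.6]]              # orthonormal rows
coeffs, constant = [1.0, 2.0], 0.5
x = [13.0, 80.0]

y = to_alldata_axes(x, typical, scale, cg, axes)
lhs_axes = sum(a * yk for a, yk in zip(coeffs, y)) - constant
b, c_orig = hyperplane_in_original_variables(coeffs, constant,
                                             typical, scale, cg, axes)
lhs_orig = sum(bj * xj for bj, xj in zip(b, x)) - c_orig
print(lhs_axes, lhs_orig)   # the two evaluations should agree
```

Once b and c' are known, each new point needs only one dot product in its original variables, with no normalization or axis transformation.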


HPO  (Digit 12 of IOUTPS) Prints for each test point on the original
sample files, the distance to the point along the normal to each
pair of hyperplanes. This is calculated using the test points in
the original variables and the hyperplanes in the original
variables. Thus, this output is a check on prior output, HPL,
which it should duplicate.


EDS  (Digit  1  of  IOUTPC)  Prints  matrix  of  Euclidean  distances  between 
each  pair  of  normalized  points. 

SDS  (Digit  2  of  IOUTPC)  Prints  matrix  of  Stepping  Stone  distance 
between  each  pair  of  normalized  points. 

STP  (Digit  3  of  IOUTPC)  Prints  matrix  of  count  of  steps  between  each 
pair  of  normalized  points. 

FPT  (Digit 4 of IOUTPC) Prints matrix of frontier points in going
from one point to any other point. The ij entry is the frontier
point encountered first in going from point i to point j.

MEM  (Digit 5 of IOUTPC) Prints for each test point the number of the
cluster to which the point belongs, with a minus sign to denote
an interior point. Also prints the count of set members in each
cluster.

CCD  (Digit  6  of  IOUTPC)  Prints  the  Stepping  Stone  distance  between 
clusters  as  well  as  the  Stepping  Stone  diameter  of  each  cluster, 
that  is,  the  maximum  Stepping  Stone  distance  between  pairs  of 
points  which  both  belong  to  the  cluster. 

SUR  (Digit 12 of IOUTPC) Prints out the fit of a quadric surface to a
cluster. This option should not be exercised since the validity
of the output is questionable. This matter has not been clarified
because it has not had high priority and because the
shape and size of a cluster are adequately described by the
calculation of cluster axes. The present option has, however,
been retained, since it is worth validating.

(The generic term "group" will be used for "set" or "cluster".)


BAX  (Digit 5 of IOUTPS, Digit 7 of IOUTPC) Prints semiaxes of each
group, and the length of each semiaxis.

PRJ  (Digit 6 of IOUTPS, Digit 8 of IOUTPC) Plots for each group the
projection of its points on each axis. If the points are
represented with coordinates on each axis, the projection is
simply the plot of the coordinates for each axis. This provides
some visual grasp of how the points are concentrated in the
group. The plots are printer plots and quite adequate for their
purpose.

BTR  (Digit 7 of IOUTPS, Digit 9 of IOUTPC) Prints transformation to
and from bottle axes.

CGR  (Digit 8 of IOUTPS, Digit 10 of IOUTPC) Prints center of gravity
of each group. If the group is a cluster, then there is printout
as to where the center of gravity is located relative to its
cluster.

COR  (Digit 9 of IOUTPS, Digit 11 of IOUTPC) Prints correlation matrix
for each group. Also prints the volume of the points in the
group as reflected by the determinant, labeled the CVOLUME, as
well as the volume of the bottle containing the points. This
output is not valid, but its correction has been deferred in
favor of higher priority items. The option has not been
suppressed since it is conceptually worthwhile and deserves
completion.


The  logical  flow  in  GALACTIC  is  extremely  simple,  going  directly  from 
beginning  to  end  with  no  major  loops;  the  only  deviation  is  a  section 
of  code  in  the  form  of  an  open  subroutine  which  is  executed  once  for 
sets  from  early  in  the  code  and  is  executed  again  for  clusters  in  the 
sequence  in  which  it  is  located. 

Accordingly, the artificial format of block diagrams will not be
needed. Each section of code will be denoted using statement numbers,
since the VAX computers do not use labels for lines. The beginning or
end of a section of code will be denoted, for example, as S580-2, meaning
the second line preceding the statement labelled 580. The statements are
labelled sequentially with few exceptions.

beginning to S110+1    Data organization
S110+3                 Detect whether batch or time sharing mode
S110+4 to S120-1       Summary data for 891 Annular Cascade test points, 16
                       variables: min, max, average, std dev, mid point, range
S120 to S210-1         Print title, date
                       Read names of data files from Annular Cascade Data Base
                       Open temporary data files: 10, 11, 12, 13, 14, 15, 22,
                       23, 24, 25
S210 to S300-11        Read which variables to be included, any sets or
                       points to be excluded
S300-10 to S360+2      Read data files 01 into XINF tables, named variables
                       only
                       Accumulate in VARMOM min, max, sum, sum of squares for
                       each variable
                       Count sets in ISETNAM, points/set in NPINSET
                       Set ID per point in IPINSET
S360+3 to S450-1       Print points/set and set ID
                       Change ID to set number in IPINSET
S450 to S580           Choose option for typical, scale value for each
                       variable
S580+1 to S630-1       Read which clusters are bonded, NBOND, IBOND
S630 to S640-1         Read output codes into IOUTPUT, IOUTPS, IOUTPC
                       If required, go back to do S495-S510+2 to print
                       summary data for each variable
S640 to S740+1         Transform XINF data using typical and scale values,
                       print, save new XINF on file 10
S750 to S809-1         Calculate EDIST matrix of Euclidean distance between
                       pairs of points
                       Save on files 11 and 12
                       Initialize file 13 to matrix of 1's with 0 on diagonal
                       Print min distance between sets, max distance between
                       points and within sets
S809 to S812+1         Call VERTEX to decide which points have a max or min
                       coordinate
S812 to S790+1         Set parameters for open subroutine (S7075 to S7610+5)
                       to do basic geometric analysis by set
                       Initialize "group" to mean "set"


S7075  to  S7090+1 
S7090  to  S7130 
S7130+1  to  S7160 
$7160+1  to  S7230-1 


S7230  to  S7240-1 
S7240  to  S7320 
S7330  to  $7340 
S7340+1  to  S7350+1 
S7360 

S7360+1  to  S7460+1 


S7465  to  S7550 


S7550+1  to  S7610 


S7610+1  to  S7610+6 


S970  to  S 990-1 


S990  to  S9105-1 


(open  subroutine;  "group"  -  "sets"  or  "clusters") 


Initialization 

Read data from file 10 into XINF, rearranging by group

Find CG for each group, subtract from data in XINF

Call BOTTLE to find direction and length of axes

Count of non zero axes into NAINCL

Print diagnostic if dot product of axes exceeds 10^-5

Projection of points in group on each axis

Transformation to and from axes

Go back to S7130+1 for next group, if any

Print  out  CG  of  each  group 

If  "group"  means  "set,"  skip  to  S465 

Read in test points from file 01, adjust for typical
and scale values

Determine if CG within, outside, toward a cluster

Compute  correlation  matrix  VAR  for  points  represented 
by  coordinates  on  bottle  axes 

Compute  determinant  of  VAR  matrix,  and  CVOLUME  of  group 
Compute BVOLUME, i.e. volume of box with same axes

If  "group"  means  "set,"  go  to  S970 
If  "group"  means  "cluster,"  go  to  S7615 


Use only axes at least 5% as long as first axis
Call  VERTX  to  find  vertices  for  each  set 
Restore  XINF  from  file  10 


S990 to S9105-1      Initialize axes, axis length and set CG to 0
                     Subtract CG from XINF data
                     Call BOTTLE for alldata axes (i.e. irrespective of set)
                     Print axes, length, CG, transformation to and from
                     alldata axes, test points in terms of alldata axes

S930 to S2080+1      Sort both sets of vertices, combine

S2085 to S2246-1     Determine if points not yet found to be vertices are
                     really vertices or are redundant
                     Exhibit redundant points as linear convex combination
                     of vertices

S2246 to S2250+1     Print for each point its set number, with minus for
                     redundancies

S2260 to S2340       Call INSECT to determine discriminant feasibility
                     (first method), that is if sets are disjoint.
                     Begin with 1st alldata axis, including successive axes
                     until disjointness occurs or axes exhausted.
                     If not disjoint, using all axes, exhibit a point as
                     convex linear combination of points in each set.
                     Do this for all pairs of sets.

S2370 to S2390+2     Copy XINF in all data axes into file 22

S2400 to S2447+2     Call DISCRM to determine pair of hyperplanes by
                     Relaxed Discriminant method (3rd method)
                     Begin with the number of all data axes found by INSECT
                     to suffice for disjointness
                     If DISCRM finds these do not suffice, more axes are
                     included
                     If INSECT found sets are disjoint, DISCRM begins with
                     gap case
                     If INSECT found sets are not disjoint, DISCRM begins
                     with overlap case
                     If DISCRM fails, it goes to opposite case using all
                     axes and accepts answer

S2447+3 to S2513+1   Read test points in alldata axes from file 22
                     Calculate for each point its distance orthogonal to
                     each pair of hyperplanes
                     Mark with E or B if stability condition of point is
                     known and point is on wrong side of a pair of planes
                     or in band

S2517 to S2515+1     Adjust hyperplane to eliminate borderline violations
                     Hyperplane orientation not changed


S2520  to  S2510+1 


S3000  to  S3064-1 

S3064  to  S3080-1 

S3080 

S3080+1  to  S3320 

S3320+1  to  S3470+1 

S3470+2  to  S3494+1 


S3495  to  S3499-1 

S3499 

S3499  to  S3498-1 
S3498  to  S3493+2 


Print set number for each test point, minus if
redundant

If point is far from planes, on correct side, it is
redundant

Open  Revised  Sample  file  FILINR  as  file  03 

Read  name  of  Population  file  FILINT 

Read id of sets or points to be excluded or sets to be
combined

Invent  new  name  if  FILINR  already  exists 
Skip  if  FILINR  is  blank 

Read  test  points  from  01,  copy  onto  03  unless  excluded 
or  redundant 

Set  ISETNAMP  equal  ISETNAM 

Skip  to  S3499  if  FILINT  is  blank 
Open  FILINT  as  file  02 
Read  point  from  FILINT 

If point is not excluded and has a set not treated in
preceding discriminant analysis, copy it onto FILINR
Convert point to all data axes

Apply hyperplane equations to point
Determine for each pair if error, if distance to point
along orthogonal to planes is >, < or between (=) to
constants positioning the pair of planes

If set membership of point is known and a hyperplane
is violated, write point onto file 03, tallying type
of error

Print distance to point in direction orthogonal to
planes with mark >, <, =

Do  for  all  points  on  FILINT,  file  02 

Read in name of next population file with new
exclusions possible

Redo  calculations,  going  back  to  S3100 

If  no  more  population  files,  FILINT,  rewind  03 

Print  out  E  totals,  if  any 

Print tally of types of error, cases of false
identification, number and source of points on revised
sample file 03
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S4000  to  S4060 


S4060+1  to  S4067+2 


S4067+3  to  S4121-6 


S4121-5  to  S4200+5 


S6000  to  S6030-1 
S6030  to  S6200 


S6200+1 

S6200+2  to  S6210-1 

S6210  to  S6360+1 

S6370-7  to  S6550+1 


Transform  hyperplane  equations  to  original  input 
variables 

Apply  transformed  hyperplanes  to  test  points  on  file 
01,  compute  distance  to  point,  orthogonal  to  plane  as 
check  against  same  calculation  in  alldata  axes 

Read in names FILEHI and FILEHO of old and new
hyperplane  files 

Open  these  files  as  04  and  07,  create  new  name  if 
necessary  for  FILEHO 

Read  in  FILEHI 

Copy  hyperplane  from  FILEHI  onto  FILEHO  if  no  new 
hyperplane  was  developed  for  the  same  pair  of  sets,  or 
if  old  hyperplanes  same  as  new 

Otherwise ask user if in time sharing; use old if in
batch 

Rewind  files  and  close 


Initialization 

Calculate SDIST matrix of Stepping Stone distance
between each pair of points, IDIST matrix of number of
steps between pairs of points and IFRONT matrix of
pair of frontier points in going from one point to any
other point

This makes one sweep through the data

Number of changes is counted in ICHANGE2, number of
sweeps is counted in ICHANGE1

If ICHANGE2 is 0, go to S6220

Increase ICHANGE1 by 1

Print  out  ICHANGE2  and  computer  cost  for  sweep 
User  decides  whether  to  continue;  if  yes,  go  to  S6030 

Delete  temporary  files  no  longer  needed  (12,  13,  14, 
15)  or  (22,  23,  24,  25) 

Print SDIST, IDIST, IFRONT

Identify  clusters 

ICLUSTER  has  point  numbers  (columns)  for  each  cluster 
(row) 



S6550+2  to  S6730+1 


Find  subclusters 


S6730+2  to  S6880+3 

S6890  to  S6950 

S6950+1  to  S7020 
S7020+1  to  S7060 

S7061  to  S7075-1 

S7615  to  S7720 

Termination  Procedure 
S9000-2 to S9010
S9010+1 to S9130+1

S9140  to  S9170-1 


Combine  clusters  that  are  bonded 
Combine  if  cluster  has  only  one  point 

Calculate  and  print  IPINCL,  showing  for  each  point  the 
number  of  its  cluster 

Count points in each cluster, by set

Find  Stepping  Stone  distance  between  clusters 
Also  max  Stepping  Stone  distance  within  clusters 

Initialize  "group"  to  mean  "cluster"  for  Basic 
Geometric  Analysis  open  subroutine 
Execute  open  subroutine 

Quadratic  equation  enclosure  for  each  cluster 


Print  computer  time  used 

If  required  by  user,  transform  data  to  all  data  axes  or 
axes  of  some  cluster  and  go  to  S450 

If  required  by  user,  condense  clusters  to  their  CG  and 
go  to  S580 

User decides whether to stop, to go to S130 to begin a
new  case,  or  to  go  to  S580  to  redo  the  same  case  but 
with  possible  changes  in  output  or  bonding  between 
clusters 


S9170  to  S9180+1 


In  this  section  several  basic  geometric  questions  will  be  addressed, 
such  as  the  shape  and  size  of  a  set  of  points,  the  coherence  between 
points  and  the  correlation  between  variables.  This  may  be  termed  matrix 
geometric  analysis  since  the  geometric  observations  are  simply 
interpretations  of  standard  matrix  analysis. 


The number of test points will be denoted by m and the number of
variables per test point will be denoted by n. The data points will be
written in an m x n matrix A, one point per row. Here m may be much
bigger than n. Also the number of linearly independent rows in A may be
less than either m or n.


Normalization  of  Data  Points 


In  treating  test  data  as  geometric  points  it  is  helpful  to  normalize 
coordinates  to  have  the  same  range.  For  example  a  mach  number  reading 
may  be  0.9  while  a  temperature  reading  may  be  900. 


To  normalize  a  variable,  it  is  customary  to  subtract  a  typical  value  and 
then  multiply  by  a  scale  factor.  The  typical  value  might  be  any  measure 
of  central  tendency,  like  the  mean,  the  mid  point  of  the  range,  etc. 

The  scale  factor  might  be  the  reciprocal  of  a  measure  of  dispersion  like 
the  standard  deviation,  the  range,  etc. 


The  center  of  gravity  of  the  set  of  points  is  the  point  each  of  whose 
coordinates  is  the  average  of  that  coordinate  of  the  points  of  the  set. 
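
The normalization described above can be sketched as follows. This is a minimal illustration, assuming the mean is taken as the typical value and the reciprocal of the standard deviation as the scale factor; the function name `normalize_columns` and the sample readings are hypothetical and are not taken from the program.

```python
import math

def normalize_columns(points):
    """Subtract a typical value (here the mean) from each variable and
    multiply by a scale factor (here 1 / standard deviation)."""
    m = len(points)
    n = len(points[0])
    # The vector of column means is also the center of gravity of the set.
    means = [sum(p[j] for p in points) / m for j in range(n)]
    stds = []
    for j in range(n):
        var = sum((p[j] - means[j]) ** 2 for p in points) / m
        # A constant variable has zero dispersion; leave it unscaled.
        stds.append(math.sqrt(var) if var > 0 else 1.0)
    scaled = [[(p[j] - means[j]) / stds[j] for j in range(n)] for p in points]
    return scaled, means, stds

# Hypothetical Mach number and temperature readings with very different ranges.
raw = [[0.9, 900.0], [0.7, 850.0], [0.8, 950.0]]
scaled, means, stds = normalize_columns(raw)
```

After this step every variable has zero mean and unit dispersion, so no single coordinate dominates distances between points merely because of its units.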


Shape of a Set of Points Using Eigenanalysis


A basic geometric question is what sort of shape is formed by these
points, what sort of shape would enclose them.

One way to answer this question is to consider all vectors $y = A^T x$
where x has unit length, $x^T x = 1$. The possible y's consist of all
possible linear combinations of the points in A, including in particular
the points themselves. The question of which y has maximum length is
answered by choosing x so as to maximize $y^T y$ subject to the constraint
$x^T x = 1$. This, written with a Lagrangian multiplier s, requires
maximization of

$$x^T A A^T x - s\,(x^T x - 1)$$

which occurs when

$$A A^T x - s\,x = 0$$

Thus the x that maximizes the length of y is the eigenvector of $A A^T$
with greatest eigenvalue s, and the corresponding value of $y^T y$ is s.

The successive eigenvectors and eigenvalues give the same answer but in
directions orthogonal to the prior y's. If $x^1$ and $x^2$ are distinct
eigenvectors

$$(x^2)^T A A^T x^1 = 0$$

and so the y's are also orthogonal

$$(y^2)^T y^1 = 0$$

The y's thus generated form a set of orthogonal vectors of lengths given
by the square root of the corresponding eigenvalues s.

The y vectors thus gotten are themselves eigenvectors of $A^T A$ since if
$y = A^T x$ and x is an eigenvector, then

$$A^T A\,y = A^T A A^T x = s\,A^T x = s\,y$$

The ellipse satisfied by the y eigenvectors when they have the length
$s^{1/2}$ is given by the quadratic equation

$$1 = y^T (A^T A)^{-1} y$$

If y is an eigenvector of unit length and r y is substituted into the
quadratic equation, the right hand side is $r^2 / s$, so that r must
be $s^{1/2}$ for the vector to satisfy the equation of the ellipsoid. This
says that the ellipsoid contains the y axes as required.

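For example if m = 3 and n = 2, consider points (1, 0), (0, 1), (1/2,
1/2). Then

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1/2 & 1/2 \end{pmatrix}, \qquad
A A^T = \begin{pmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1/2 \\ 1/2 & 1/2 & 1/2 \end{pmatrix}$$

The eigenvectors and eigenvalues of $A A^T$ are

$$(1/\sqrt{3},\ 1/\sqrt{3},\ 1/\sqrt{3})^T \text{ with eigenvalue } 3/2$$
$$(1/\sqrt{2},\ -1/\sqrt{2},\ 0)^T \text{ with eigenvalue } 1$$
$$(1/\sqrt{6},\ 1/\sqrt{6},\ -2/\sqrt{6})^T \text{ with eigenvalue } 0$$

Since m exceeds n, the m x m matrix $A A^T$ is necessarily singular, being
of rank at most n.

The corresponding $y = A^T x$ vectors are, respectively

$$(\sqrt{3}/2,\ \sqrt{3}/2)^T \text{ which is of length } \sqrt{3/2}$$
$$(1/\sqrt{2},\ -1/\sqrt{2})^T \text{ which is of length } 1$$
$$(0,\ 0)^T \text{ which is of length } 0$$

the first two of which are eigenvectors of

$$A^T A = \frac{1}{4}\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}$$

with eigenvalues 3/2 and 1.

The ellipse with these axes is $1 = y^T (A^T A)^{-1} y$.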
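
The worked example can be checked numerically. The sketch below uses only plain Python lists; the helper names `matmul`, `transpose` and `matvec` are illustrative and not part of the program described in this report.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matvec(P, v):
    return [sum(P[i][k] * v[k] for k in range(len(v))) for i in range(len(P))]

A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # one point per row (m = 3, n = 2)
At = transpose(A)
AAt = matmul(A, At)    # 3 x 3
AtA = matmul(At, A)    # 2 x 2

r3, r2, r6 = math.sqrt(3), math.sqrt(2), math.sqrt(6)
eigenpairs = [
    (1.5, [1 / r3, 1 / r3, 1 / r3]),
    (1.0, [1 / r2, -1 / r2, 0.0]),
    (0.0, [1 / r6, 1 / r6, -2 / r6]),
]

ys = []
for s, x in eigenpairs:
    # x is an eigenvector of A A^T with eigenvalue s ...
    residual = [lhs - s * xi for lhs, xi in zip(matvec(AAt, x), x)]
    assert max(abs(r) for r in residual) < 1e-12
    # ... and y = A^T x has length sqrt(s).
    y = matvec(At, x)
    ys.append(y)
    length = math.sqrt(sum(c * c for c in y))
    assert abs(length - math.sqrt(s)) < 1e-12
    print(s, [round(c, 4) for c in y], round(length, 4))
```

The assertions pass only if each listed vector is an eigenvector of $A A^T$ with the stated eigenvalue and the corresponding y has length equal to the square root of that eigenvalue.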


The ellipsoid thus gotten is representative of the shape of the set of
points, especially if the original points have been normalized so that
each coordinate has a commensurate range, and has been shifted to a
central value.

The above development, drawn from classical eigenanalysis, provides an
ellipsoid which has the shape of the array of data points. When m is a
large number, computation of the eigenvectors x can be an imposing task.
This is less so for the eigenvectors y since the $A^T A$ matrix is n x n
and much smaller than the $A A^T$ matrix. Nonetheless eigenanalysis can
be difficult.

It is therefore useful to observe that the problem when seen from the
viewpoint of matrix norms can be considerably simplified.

The problem of finding the y eigenvectors may be stated as: to find y
such that $\|Ay\|$ is maximized while $\|y\|$ is kept constant. Here the
vector lengths are stated in the usual Euclidean norm:

$$\|y\| = \left[\, y_1^2 + y_2^2 + \dots + y_n^2 \,\right]^{1/2}$$

This approach will be presented next.


While  the  ellipsoid  as  gotten  above  should  reflect  the  shape  of  the 
points  represented  in  A,  and  contain  the  axes,  it  is  not  clear  that  all 
points  are  necessarily  within  the  ellipsoid.  Consider  the  rectangular 
box  whose  axes  are  those  of  the  ellipsoid. 

Any point may be represented as a combination of eigenvectors $y^i$ of
unit length:

$$y = r_1 y^1 + r_2 y^2 + \dots$$

Suppose that a point, say $y^0$, of A were outside the box. Then $r_i^2$
exceeds $s_i$ for some i, and $y^0$, not $y^i$ the actual eigenvector,
would have given the maximum of $y^T A^T A y$ for components of points
orthogonal to the prior eigenvectors. But this is contrary to the
assumption that $y^i$ is an eigenvector; it follows that all the points
of A are within the rectangular box.


However it is not clear that there may not be points lying inside the
box but outside the ellipsoid. Again consider a point $y^0$ not along an
eigenvector, to be represented in terms of the eigenvectors $y^i$ of unit
length. For this point not to be an eigenvector it is necessary that
when $x^0$ is chosen (so that $y^0 = A^T x^0$) with all components zero
but the one corresponding to this point, the result $y^T y$ is less than
$s_1$:

$$r_1^2 + r_2^2 + r_3^2 + \dots < s_1$$

Similarly when the same is done for the components of y orthogonal to
prior $y^i$

$$r_2^2 + r_3^2 + \dots < s_2$$
$$r_3^2 + \dots < s_3$$

When the point y is substituted into the equation of the ellipsoid the
result is

$$\frac{r_1^2}{s_1} + \frac{r_2^2}{s_2} + \frac{r_3^2}{s_3} + \dots$$

and the question is how this compares with 1.

The preceding inequalities may be summed: $1/s_1$ times the first,
$(1/s_2 - 1/s_1)$ times the second, $(1/s_3 - 1/s_2)$ times the third, etc.
The result is

$$\frac{r_1^2}{s_1} + \frac{r_2^2}{s_2} + \frac{r_3^2}{s_3} + \dots
  < n - \frac{s_2}{s_1} - \frac{s_3}{s_2} - \dots$$

which lies between 1 and n. The right-hand side can be shown, by the
inequality between arithmetic and geometric means, to be no greater than

$$n - (n-1)\left(\frac{s_n}{s_1}\right)^{1/(n-1)}$$

The ratio of the maximum to minimum eigenvalues is one of the measures
of the condition of a matrix.

This implies that if the ellipsoid is enlarged so that the unity on the
right hand side of its equation is replaced by
$n - \sum_{i=1}^{n-1} s_{i+1}/s_i$ then it will enclose the data points.
This however may be a less efficient container for the points than the
box having the same axes as the ellipsoid.
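
As a small numerical check of the box and enlarged-ellipsoid statements, the sketch below reuses the three points of the earlier example. It is limited to two variables so that the eigenanalysis of $A^T A$ can be done in closed form; the helper `sym2_eig` is a hypothetical name introduced for this illustration only.

```python
import math

def sym2_eig(M):
    """Eigenvalues (descending) and unit eigenvectors of a symmetric 2 x 2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean, half = (a + d) / 2, math.hypot((a - d) / 2, b)
    s1, s2 = mean + half, mean - half
    # Eigenvector for s1; fall back to a coordinate axis when M is diagonal.
    if abs(b) > 1e-15:
        v1 = [b, s1 - a]
    else:
        v1 = [1.0, 0.0] if a >= d else [0.0, 1.0]
    norm = math.hypot(v1[0], v1[1])
    v1 = [v1[0] / norm, v1[1] / norm]
    v2 = [-v1[1], v1[0]]          # orthogonal to v1
    return [s1, s2], [v1, v2]

points = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # rows of A
n = 2
AtA = [[sum(p[i] * p[j] for p in points) for j in range(n)] for i in range(n)]
s, axes = sym2_eig(AtA)

# Enlarged right-hand side of the ellipsoid equation: n - sum of s_{i+1}/s_i.
bound = n - sum(s[i + 1] / s[i] for i in range(n - 1))

results = []
for p in points:
    r = [sum(pc * vc for pc, vc in zip(p, v)) for v in axes]   # coordinates on the axes
    in_box = all(r[i] ** 2 <= s[i] + 1e-12 for i in range(n))
    value = sum(r[i] ** 2 / s[i] for i in range(n))            # left side of ellipsoid equation
    results.append((in_box, value, value <= bound))
    print(p, in_box, round(value, 4), value <= bound)
```

For each point the script reports whether it lies in the box, the value obtained on substituting it into the ellipsoid equation, and whether that value stays below the enlarged right-hand side.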


An alternate approach is to seek a direction y in which the rows of A
have the greatest projection vector. The projection components are the
entries in $p = Ay$. The projection vector is greatest in the sense
that the maximum projection component is greatest for all y's of unit
length. The maximum absolute component of a vector is a norm, called here
the $L_1$ norm (elsewhere usually termed the maximum or $L_\infty$
norm), so that

$$\|p\|_{L_1} = \max \{\, |p_1|, |p_2|, \dots, |p_m| \,\}$$

The best y solves the problem:

$$\text{maximize } \left\{ \|Ay\|_{L_1} - s\,(\|y\|^2 - 1) \right\}$$

where the usual Euclidean norm constrains y to have unit length. (Note
that the Euclidean or $L_2$ norm is the usual vector norm; it is implied
when the type of norm is not designated.) Whatever y is found,
$\|Ay\|_{L_1}$ has the form $a_{i\cdot}\,y$ where $a_{i\cdot}$ is some row
of A. Differentiating with respect to the components of y gives

$$a_{i\cdot}^T - 2 s\,y = 0, \qquad\text{so that}\qquad
  y = \frac{a_{i\cdot}^T}{\|a_{i\cdot}\|}$$

upon imposing the constraint. Thus y is in the direction of some row of
A. Consider the projection with row j

$$a_{j\cdot}\,y = \frac{a_{j\cdot}\,a_{i\cdot}^T}{\|a_{i\cdot}\|}
  = \|a_{j\cdot}\| \cos(a_{i\cdot}, a_{j\cdot})$$

Thus the projection is greatest if y is in the direction of the row of A
having greatest length.


Having found the first direction $y^1$ in this way, other directions are
sought, each orthogonal to the prior y's. These can be gotten by the
same approach, provided the matrix A is altered so as to remove the
projections in the direction of the prior y's.

Thus if $a_{j\cdot}^{(k)}$ was the latest jth row of A and
$a_{i\cdot}^{(k)}$ was the latest direction y, then the new row would be

$$a_{j\cdot}^{(k+1)} = a_{j\cdot}^{(k)}
  - \frac{a_{j\cdot}^{(k)} \cdot a_{i\cdot}^{(k)}}
         {a_{i\cdot}^{(k)} \cdot a_{i\cdot}^{(k)}}\; a_{i\cdot}^{(k)}$$

Note that the problem: find y to maximize

$$\|Ay\|_{L_1} - s\,(\|y\|^2 - 1)$$

is the eigenvalue problem described previously except that there the
Euclidean, or $L_2$, norm $\|Ay\|_2$ was used, while here the $L_1$ norm
is used.
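
The procedure of taking the longest row as the next axis and then removing its projection from every row can be sketched as below. This is an illustration of the idea only, under the assumption that rows are plain Python lists; the function name `longest_row_axes` is hypothetical and no claim is made that it reproduces the BOTTLE subroutine.

```python
import math

def longest_row_axes(rows, tol=1e-12):
    """Return orthogonal unit directions and lengths found by repeatedly
    taking the longest remaining row and removing its projection."""
    work = [list(r) for r in rows]
    axes, lengths = [], []
    for _ in range(len(rows[0])):
        norms = [math.sqrt(sum(c * c for c in r)) for r in work]
        i = max(range(len(work)), key=norms.__getitem__)
        if norms[i] <= tol:          # remaining rows are numerically zero
            break
        y = [c / norms[i] for c in work[i]]
        axes.append(y)
        lengths.append(norms[i])
        # Deflate: a_j <- a_j - (a_j . y) y for every row j.
        for r in work:
            dot = sum(rc * yc for rc, yc in zip(r, y))
            for k in range(len(r)):
                r[k] -= dot * y[k]
    return axes, lengths

rows = [[3.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
axes, lengths = longest_row_axes(rows)
print(axes, lengths)
```

Because y is normalized before deflation, subtracting `(a_j . y) y` is the same operation as the row update formula above. No eigenanalysis is required, which is the simplification the matrix-norm viewpoint offers.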

Axes  as  New  Variables 

The axes gotten by L2 or L1 eigenanalysis provide a new orthogonal
coordinate system which is more natural for the data points than the
original coordinate system. This is especially true if the axes are
quite short in some direction, indicating a combined variable which is
nearly constant for the points in question. If these points are a
subset, then it suggests that the subset may be distinguished best from
other subsets by the variable which is nearly constant for the subset.


The matrices $A A^T$ and $A^T A$ have distinct and useful interpretations
in themselves, apart from their usefulness for other purposes.

The matrix $A A^T$ is an m x m matrix, where m is the number of data
points. Its i, j entry is

$$\|a_{i\cdot}\|\,\|a_{j\cdot}\| \cos(a_{i\cdot}, a_{j\cdot})$$

that is the dot product of the ith data point and the jth data point, or
equivalently the projection of one on the other. Thus the diagonal
entries of $A A^T$ contain the squared Euclidean lengths of the data
points. The matrix $A A^T$ is therefore the projection matrix.

If the rows and columns of $A A^T$ are divided by the square root of their
diagonal entries, the resulting matrix is composed of the cosine of the
angle between pairs of points. To the extent that cosines are near one,
the data points are confined to a narrow cone.

The $A^T A$ matrix is an n x n matrix, where n is the dimension of the
space, that is the number of variables that constitute the data point.
Its i, j entry is

$$a_{1i} a_{1j} + a_{2i} a_{2j} + \dots + a_{mi} a_{mj}$$

This provides the covariance of the readings of the ith and jth
variables according to the formula

$$\operatorname{cov}(a_{\cdot i}, a_{\cdot j}) =
  \frac{1}{m}\sum_{k=1}^{m} a_{ki} a_{kj}
  - \left(\frac{1}{m}\sum_{k=1}^{m} a_{ki}\right)
    \left(\frac{1}{m}\sum_{k=1}^{m} a_{kj}\right)$$

where the sums run over the m data points. The last term simply
normalizes A by subtracting from each entry the mean of its column. If
this is done first then the i, j entry of the resulting $A^T A$ matrix is
m times the covariance of the ith and jth normalized data readings.

Thus $A^T A$ with the indicated normalization is, apart from the factor m,
the covariance matrix of the data.


The  diagonal  entries  of  the  covariance  matrix  are  the  variances,  that  is 
the  standard  deviations  of  the  data,  squared. 


The  covariance  matrix  becomes  the  correlation  matrix  if  each  row  and 
column  is  divided  by  the  square  root  of  their  diagonal  entries.  The  ij 
entry  in  the  correlation  matrix  is  then  the  correlation  between  the 
readings  of  the  ith  and  jth  variables. 


The  off  diagonal  entries  lie  between  +1  and  -1  and  indicate  the  degree 
to  which  two  variables  change  together  (+)  or  oppositely  (-).  The  main 
diagonal  is  of  course  all  ones.  To  the  extent  that  two  variables  are 
correlated,  one  variable  is  superfluous. 
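
The relation between the centered $A^T A$ matrix, the covariance matrix and the correlation matrix can be sketched as follows. The function name and the sample readings are hypothetical; the sketch assumes one data point per row and that no variable is constant (a constant variable would give a zero variance and a division by zero).

```python
import math

def covariance_and_correlation(points):
    m, n = len(points), len(points[0])
    means = [sum(p[j] for p in points) / m for j in range(n)]
    # Subtract the column means, i.e. shift the data to its center of gravity.
    centered = [[p[j] - means[j] for j in range(n)] for p in points]
    # (A^T A) / m for the centered data matrix A.
    cov = [[sum(row[i] * row[j] for row in centered) / m for j in range(n)]
           for i in range(n)]
    # Divide each row and column by the square root of its diagonal entry.
    corr = [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]
    return cov, corr

# Two variables that change together almost linearly.
points = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
cov, corr = covariance_and_correlation(points)
print(cov)
print(corr)
```

An off-diagonal correlation close to +1, as in this sample, is the situation in which one of the two variables is nearly superfluous.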


Another  useful  piece  of  geometric  information  is  the  volume  of  the 
data.  In  three  dimensional  space  the  volume  of  the  parallelepiped 
having  three  linearly  independent  vectors  as  edges,  is  given  by  the 
determinant  of  the  3x3  matrix  having  the  coordinates  of  the  ith  point 
in  the  ith  row.  Similarly  for  two  dimensions. 


For n points in n dimensions, the volume of the parallelepiped P is
related to the volume of the n dimensional unit cube C by

$$\int_P dV = \int_C J \, dV' = J$$

where J is the Jacobian of the linear transformation T that maps the
vertices of the cube into those of the parallelepiped. The
transformation is required to have the property $T I^i = x^i$ where $x^i$
is the ith point written as column vector and $I^i$ is a column vector
having 1 in the ith position and zeroes in the other n-1 positions.
Thus $T I = A$, which is the n x n matrix whose columns are the vertices
of the parallelepiped. Accordingly the transformation that maps the
vertices of the cube into those of the parallelepiped is $A x' = x$, and
its Jacobian is $J = \det A$.

This argument can be extended by use of generalized inverses to the
general case of m (< n) points in n dimensional space, so that for
linearly independent vertices the volume of the parallelepiped is the
square root of the determinant of the m x m matrix $A^T A$. This will now
be done for the still more general case in which there may be fewer than
m linearly independent vectors.

Let the points be represented as columns of the n x m matrix A and
consider the singular value decomposition, which is available for any
matrix. A may be written as

$$A = U Q V^T$$

where U is an n x n matrix of orthonormal vectors, V is an m x m matrix of
orthonormal vectors and Q is an n x m matrix whose off diagonal entries
are zero. The columns of U are eigenvectors of $A A^T$; while the columns
of V are eigenvectors of $A^T A$. The diagonal entries of Q are the
nonnegative square roots of the eigenvalues. The three matrices are
arranged so that eigenvectors and eigenvalues correspond; the non-zero
eigenvalues are the same for $A A^T$ and $A^T A$.

A volume may be associated with $A A^T$ or $A^T A$ and the square root used
as a volume measure for A.

$$\det A A^T = \det(U Q Q^T U^T) = \det Q Q^T$$

$$\det A^T A = \det(V Q^T Q V^T) = \det Q^T Q$$

since the determinant of an orthonormal matrix is +1 or -1, because
$U^T U = I$ and $(\det U)^2 = 1$.

The determinant of $Q Q^T$ or $Q^T Q$ is simply the product of the non zero
eigenvalues times any zero diagonal entries. For the volume measure it
is convenient to use the smaller of the two matrices.

If  for  the  smaller  matrices,  there  are  k  non-zero  eigenvalues,  then  we 
can  say  that  the  points  are  in  a  k  dimensional  subspace  and  within  that 
subspace,  the  volume  measure  for  the  points  is  the  square  root  of  the 
product  of  the  non  zero  eigenvalues. 

To find the product of the non-zero eigenvalues, it is of course not
necessary to calculate all the eigenvalues. Instead it is sufficient to
calculate the left right decomposition of, say, $A^T A$: $A^T A = LR$, where
L is lower triangular with 1's on the diagonal and R is upper
triangular.

Allowing  for  interchanges,  the  calculation  can  restrict  any  zero 
diagonal  entries  in  R  to  the  last  rows.  If  there  are  k  non-zero 
diagonal  entries  then,  we  can  say  that  the  points  are  in  a  k  dimensional 
subspace  and  that  within  that  subspace  the  volume  measure  for  the  points 
is  the  square  root  of  the  product  of  the  non-zero  diagonal  entries  in 
R. 

Because  of  round  off,  judgment  would  be  needed  to  recognize  when  a 
diagonal entry in R is truly zero. Were this to occur, the calculation
of  the  eigenvalues  would  be  specially  useful  since  the  eigenvalues  give 
the  individual  dimensions  of  the  solid  that  contains  the  points.  It  is 
equally  useful  to  learn  that  such  a  solid  is  extremely  thin  in  one 
dimension  as  to  learn  that  it  is  of  zero  thickness. 
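
A minimal sketch of the volume measure by triangular decomposition follows. It assumes the points are supplied as the columns of A (each column given as a Python list), forms the Gram matrix $A^T A$, and eliminates with diagonal pivoting so that zero pivots are pushed to the last rows. The function name `volume_measure` and the tolerance `tol` are illustrative choices, not values from the program.

```python
import math

def volume_measure(columns, tol=1e-10):
    """Square root of the product of the non-zero pivots of the Gram matrix
    A^T A, where the given points are the columns of A.
    Returns (k, volume) with k the number of non-zero pivots."""
    m = len(columns)
    gram = [[sum(a * b for a, b in zip(columns[i], columns[j]))
             for j in range(m)] for i in range(m)]
    k, product = 0, 1.0
    for step in range(m):
        # Symmetric (diagonal) pivoting pushes zero pivots to the last rows.
        p = max(range(step, m), key=lambda i: gram[i][i])
        if gram[p][p] <= tol:
            break
        gram[step], gram[p] = gram[p], gram[step]
        for row in gram:
            row[step], row[p] = row[p], row[step]
        # Eliminate below the pivot; the pivots are the diagonal entries of R.
        for i in range(step + 1, m):
            factor = gram[i][step] / gram[step][step]
            for j in range(step, m):
                gram[i][j] -= factor * gram[step][j]
        product *= gram[step][step]
        k += 1
    return k, math.sqrt(product)

k, volume = volume_measure([[2.0, 0.0], [0.0, 3.0]])
print(k, volume)
```

For linearly independent points the result equals the square root of $\det A^T A$, and for n points in n dimensions it equals the absolute value of $\det A$. When the Gram matrix is rank deficient, the pivot product is the squared volume spanned by the k points chosen as pivots, which need not coincide with the product of the non-zero eigenvalues, so the two measures should be compared with care in that case.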


This  is  the  underlying  concept  for  both  the  linear  discrimination 
between  sets  of  test  points  having  different  stability  conditions,  as 
well as the recognition of irrelevant test points.




Intuitively,  the  convex  hull  of  a  finite  set  of  points  is  the  smallest 
solid  with  polygonal  faces  and  no  reentrant  corners  that  contains  all 
the  points.  Evidently  the  vertices  of  the  solid  are  all  members  of  the 
set  of  points. 

This  concept  is  of  great  importance  for  identification  of  regions  with 
the  same  stability  condition  since  it  gives  mathematical  meaning  to  the 
belief  that  the  points  with  common  stability  condition  define  a  region 
within  which  the  same  stability  condition  prevails. 

Also  if  test  points  from  two  different  stability  regions  are  close 
together,  and  are  difficult  to  distinguish,  the  mathematical 
interpretation  can  be  that  the  convex  hulls  of  the  two  regions 
intersect. 

Thus  the  mathematical  concept  of  convex  hulls  appears  to  capture 
precisely  the  intuitive  meaning  of  a  stability  region.  However  the 
convex  hull  is  a  minimal  stability  region,  whereas  the  intuitive 
stability  region  might  be  thought  to  extend  to  some  indefinite  distance 
beyond  the  test  points  known  to  be  in  the  region. 

The mathematical definition of a convex hull of a set of points S is the
smallest convex set that contains S. A convex set is a set which
contains the point $x = a_1 x^1 + a_2 x^2$, where $a_1 + a_2 = 1$ and
$a_1, a_2 \ge 0$, provided the points $x^1$ and $x^2$ are in the set.
This says simply that the line segment between every pair of points in
the set is also in the set. Thus a donut with a hole is not a convex
set, while a donut without a hole is.

Here superscripts denote different points, while subscripts as in
$x = (x_1, x_2, \dots, x_n)$ denote the coordinates, the space being n
dimensional.


A hyperplane is the locus of points x such that an equation like

$$c = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$

holds. This may be written simply as $c = a^T x$ using vector notation.
For other points the right-hand side exceeds c, indicating that those
points are in the half space on one side of the hyperplane. For still
other points, the right-hand side falls short of c, indicating that
those points are in the half space on the other side of the hyperplane.

A hyperplane, defined by a vector a and a scalar c, is said to be a
supporting hyperplane of a set if

$$c \ge a^T x$$

holds for all points x of the set, with equality holding in some cases.

Thus the set is entirely on one side of the hyperplane, with some points
of the set belonging to the hyperplane.

Boundary  points  of  the  set  may  be  defined  as  those  points  of  the  set  which 
are  in  some  supporting  hyperplane. 

A vertex is a member $x^0$ of the set which cannot be represented as a
convex linear combination, such as

$$x^0 = a_1 x^1 + a_2 x^2; \quad 1 = a_1 + a_2; \quad 0 < a_1, a_2 < 1$$

for any other points $x^1$, $x^2$ of the set.

An interior point $x^0$ of the set is a point which may be represented in
many ways as a convex linear combination

$$x^0 = a_1 x^1 + a_2 x^2; \quad 1 = a_1 + a_2; \quad 0 < a_1, a_2 < 1$$

of points of the set, with $x^1$ being any vertex of the set. Thus a line
through the point from any vertex has points of the set on both sides of
the point in question.

A vertex cannot be an interior point since its own representation on a
line through it must leave one coefficient $a_1 = 1$ and the other
$a_2 = 0$.

A  vertex  must  be  one  of  the  original  defining  points  of  the  convex  hull 
since  it  cannot  be  generated  from  other  points  using  convex  linear 
combinations. 

A  supporting  hyperplane  must  contain  some  of  the  original  defining  points 
since  the  points  of  a  convex  hull  which  are  in  a  supporting  hyperplane  can 
be  generated  only  as  intermediate  points  of  lines  within  the  plane,  not  of 
lines  that  pierce  the  plane,  since  there  are  no  points  on  one  side  of  the 
plane. 

An easy criterion to identify at least some of the vertex points in a set
is that any point with a coordinate greater, or less, than that coordinate
of any other point is a vertex. Otherwise the point would be an interior
point x with a convex linear representation

$$x = a_1 x^1 + a_2 x^2 + \dots + a_m x^m$$

in terms of other points of the set. But then

$$x_k = \sum_i a_i x_k^i \le \max_i x_k^i$$

which cannot be since by hypothesis $x_k$ exceeds $x_k^i$ for any i.

Similarly for least coordinates.

In case several points have the same maximum coordinate, say $x_k = d$, then
they belong to a supporting hyperplane $x_k = d$ and among these, those
having some maximum or minimum coordinate are all vertices.


Indeed  we  may  see  any  vertex  with  extreme  coordinate  as  a  point  in  a 
supporting  hyperplane. 
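
The extreme-coordinate criterion can be sketched as below. The function name `extreme_coordinate_vertices` is hypothetical and is not claimed to reproduce the VERTEX subroutine; ties for an extreme value are skipped here, since they need the further test described above.

```python
def extreme_coordinate_vertices(points):
    """Indices of points holding a strict maximum or minimum in some coordinate."""
    found = set()
    for k in range(len(points[0])):
        column = [p[k] for p in points]
        for extreme in (max(column), min(column)):
            holders = [i for i, c in enumerate(column) if c == extreme]
            if len(holders) == 1:      # ties need the further test described above
                found.add(holders[0])
    return sorted(found)

points = [[0.0, 0.0], [4.0, 1.0], [1.0, 3.0], [2.0, 1.0], [3.0, -1.0]]
vertices = extreme_coordinate_vertices(points)
print(vertices)
```

The criterion identifies only some vertices: a point that is a vertex of the convex hull without being extreme in any single coordinate is not reported, which is why the remaining points must still be tested for redundancy.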


It is easy to see that linear transformations of stretching, translation
or rotation map straight line segments into straight line segments, so that
topological properties such as a point being a vertex, an interior point,
or in a supporting hyperplane are all preserved. Such transformations can,
however, alter which point has an extreme coordinate and could in fact
produce all the vertices.

Note that there can be at least 2n vertices with a maximum or minimum
coordinate. Thus in higher dimensional space, a smaller proportion of
points in a set are likely to be interior points.


Simplicial coordinates for an n dimensional space are defined in terms of
n+1 points $x^1, x^2, \dots, x^{n+1}$. That an arbitrary point x has
simplicial coordinates $a_1, a_2, \dots, a_{n+1}$ means that

$$x = a_1 x^1 + a_2 x^2 + \dots + a_{n+1} x^{n+1}$$

and

$$1 = a_1 + a_2 + \dots + a_{n+1}$$


To determine the simplicial coordinates, it is necessary to solve the
(n + 1) x (n + 1) linear system

$$\begin{pmatrix} X^T \\ 1^T \end{pmatrix} a = \begin{pmatrix} x \\ 1 \end{pmatrix}$$

where $x^i$ in rectangular coordinates occupies the ith row of X. The set
is solvable if the (n + 1) x (n + 1) matrix $(X \;\; 1)$ is not singular,
that is if the n + 1 vectors $x^i$ are not coplanar, that is no vector b and
scalar d can be found so that $x^T b = d$ can be satisfied by all n+1 vectors.

In two dimensions the three points are the vertices of a triangle. The
sides of the triangle, extended, divide the space outside the triangle into
6 regions. Inside the triangle $a_1$, $a_2$, $a_3$ are positive. Upon
crossing each side, one of the coordinates turns negative. Upon leaving the
triangle through a vertex, two coordinates turn negative. Since the $a_i$
add to 1, there can be no region wherein all simplicial coordinates are
negative. The indication that a point is a boundary point is that some
simplicial coordinate is 0. It is of course a vertex if all simplicial
coordinates are 0, except one.
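
A minimal sketch of the computation follows, assuming exact rational arithmetic and a small Gauss-Jordan solver written here for the purpose. The names `solve` and `simplicial_coordinates` are illustrative and are not taken from the report's code.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M a = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("points are coplanar; the matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def simplicial_coordinates(points, x):
    """points: n+1 points in n-space; x: a point in n-space."""
    n = len(x)
    # One equation per rectangular coordinate, plus the row forcing sum(a) = 1.
    M = [[p[k] for p in points] for k in range(n)] + [[1] * len(points)]
    return solve(M, list(x) + [1])


triangle = [(0, 0), (1, 0), (0, 1)]
inside = simplicial_coordinates(triangle, (Fraction(1, 4), Fraction(1, 4)))
outside = simplicial_coordinates(triangle, (1, 1))
print(inside)   # all three coordinates positive
print(outside)  # the first coordinate is negative
```

For the unit triangle the coordinates of (x, y) work out to (1 - x - y, x, y), so the sign pattern described above can be read off directly.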


Any point of the convex hull of a set of points $x^1, x^2, \dots, x^m$ can be
expressed as a convex linear combination

$$x = a_1 x^1 + a_2 x^2 + \dots + a_m x^m$$

where

$$1 = a_1 + a_2 + \dots + a_m$$

and $0 \le a_i \le 1$ for i = 1, 2, ..., m.

This is true of points generated as linear combinations of some two of the
original points. Subsequently if this is true of two points x', x'' then
it is true of a convex linear combination of them

$$x = a' x' + a'' x''$$

$$= a'(a'_1 x^1 + a'_2 x^2 + \dots) + a''(a''_1 x^1 + a''_2 x^2 + \dots)$$

$$= (a' a'_1 + a'' a''_1) x^1 + (a' a'_2 + a'' a''_2) x^2 + \dots$$

since

$$\sum_i (a' a'_i + a'' a''_i) = a' + a'' = 1$$

and

$$a' a'_i + a'' a''_i \ge 0$$

and therefore

$$a' a'_i + a'' a''_i \le 1$$

A hyperplane $c = x^T a$ in n dimensional space can be defined to contain n
linearly independent points, say $x^1, x^2, \dots, x^n$. If the n coordinates
of these points are written as the rows of an n x n matrix X then, with c
scaled to 1,

$$Xa = 1$$

where 1 denotes a column vector of all 1's. Once a is found it is
frequently desirable to replace a by the unit vector $a/\|a\|$ with
$1/\|a\|$ written as the right hand side.

If the points are not linearly independent then the matrix X is
singular. A plane that is to one side of m points, where m may be
greater or less than the dimension n of the space, can be expressed as

$$Xa \ge c$$


Similarly  a  plane  that  discriminates  between  a  set  of  points  whose 
coordinates  are  the  rows  of  X  and  another  set  of  points  whose  coordinates 
are  the  rows  of  Y,  can  be  expressed  as 

$$Xa \ge c \ge Ya$$

These ideas will be developed further in the discussion of discriminant
analysis. They will there be used to discriminate between the test data
points exhibiting one stability condition, and those exhibiting a different
stability condition.

A different use of these concepts is the identification of redundant
points, that is, test data points which are in the interior of the convex
hull defined by other test data points. Such points are of no use in
defining where one stability region ends and another begins. They may
therefore be discarded, thus reducing the size of the relevant data base.

An interior point x may be expressed as

$$x = X^T a, \qquad 1^T a = 1, \qquad 0 \le a_i \le 1$$

where the X matrix consists of the other data points, written as rows. The
data points already known to be redundant may be omitted from X.
This is a linear programming problem without objective function, that is,
it is the feasibility problem. It can be solved in the usual way by
introducing an artificial objective function representing the violations of
the constraints. If this objective function can be reduced to zero then
the problem is feasible. Such a device is quite reasonable and economical
if upon establishment of feasibility, an authentic objective function is
then pursued by the mechanism already in place.
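
For very small data sets the feasibility question can be answered by brute force, which is useful as a check on any simplex-type code. The sketch below is not the method of this report: it simply tries every set of dim+1 points as a candidate basis and tests whether the resulting coefficients are non-negative. It assumes that the other points contain dim+1 points that are not coplanar, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve(M, rhs):
    """Gauss-Jordan with exact fractions; returns None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def is_redundant(x, others):
    """True if x is a convex combination of the points in `others`."""
    dim = len(x)
    rhs = list(x) + [1]
    # Each choice of dim+1 points is one candidate basis (exponential cost).
    for basis in combinations(others, dim + 1):
        M = [[p[k] for p in basis] for k in range(dim)] + [[1] * (dim + 1)]
        a = solve(M, rhs)
        if a is not None and all(c >= 0 for c in a):
            return True
    return False


square = [(0, 0), (4, 0), (0, 4), (4, 4)]
print(is_redundant((2, 2), square))  # True: inside the hull
print(is_redundant((5, 5), square))  # False: outside the hull
```

The cost grows combinatorially with the number of points, which is exactly why a pivoting scheme is preferred for data bases of realistic size.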

However the artificial objective function is less attractive when
feasibility is the only question to be answered. Thus it seemed useful to
modify the usual simplex algorithm so as to bypass use of an objective
function.

A further compelling reason for doing this is to accommodate the coding to
the special situation here, where a test point x once found to be
non-redundant would need to be considered for the basis used in evaluating
subsequent test points for redundancy.

To take advantage of these two aspects of this particular application, a
special modification of the simplex algorithm has been developed. It has
been named the "Non-Objective Simplex Method."


The problem addressed is feasibility of the linear programming problem:
find x such that

$$Ax = b, \qquad x \ge 0$$

where A is an m x n matrix with $m \le n$, x is a vector of n entries and b
has m entries. Feasibility asks whether there are any x vectors meeting
these requirements.

The  objective  function  which  might  accompany  this  problem  is  not 
relevant  to  the  feasibility  question.  However  it  is  usual  to  introduce 
an  artificial  objective  function  and  artificial  variables  so  that  the 
augmented  problem  is  trivially  feasible  and  if  its  (artificial) 
objective  function  can  be  driven  sufficiently  large,  feasibility  of  the 
original  problem  will  have  been  established. 

There are two disadvantages of the usual approach. It enlarges the
matrix A, and feasibility is only obliquely addressed and therefore
perhaps inefficiently. Its advantage is that the code designed to solve
the linear programming problem with objective function is also used for
the prior task of finding a feasible starting point. But this is no
advantage if only feasibility need be shown.

Thus it is believed that the approach to be described may prove more
efficient for the case in which the question of feasibility is the only
task.

Since this development was done separately from other work being
reported on, the notation is somewhat different.

The columns of A will be denoted as $a^1, a^2, \dots, a^n$. A will be
assumed to be of full rank m. The vector x is a set of coefficients for
the columns of A, expressing b as a linear combination of the columns of
A. Since any vector can be expressed in terms of only m columns of A,
it will be assumed that only m components of x are non-zero.

Feasibility is established if an x is found having no negative
coefficients. The feasible solution will be found by generating a
sequence of non-feasible solutions $x^1, x^2, \dots$ which hopefully will
have fewer and fewer negative coefficients.

Initially $x^1$ is gotten by picking m linearly independent columns of A,
and solving for the linear combination $x^1$ by which they represent b.
Denote by $n^1$ the number of negative coefficients in $x^1$. The next
goal is to find an $x^2$ which will have fewer negative coefficients.

To do this, it is convenient to represent all the columns of A as linear
combinations of the basis columns. These representations are collected
in a matrix E, whose ij entry is the coefficient of the ith basis vector
in the representation of the jth column of A. If the jth column is a
member of the basis, say the kth member, then column j of E is all zeros
except that $e_{kj}$ is 1. This matrix E is essentially the "tableau"
spoken of in linear programming literature.

Thus if B consists of the m columns of A which constitute the basis then

$$E = B^{-1} A, \qquad x^1 = B^{-1} b$$

so that the columns of A in the basis become columns of the identity in E.


Consider what advantage there might be in replacing some member of the
basis used in $x^1$ by column k of A.

If the basis consists of columns numbered $c_1, c_2, \dots, c_m$ of A then

$$a^k = \sum_{i=1}^{m} e_{ik}\, a^{c_i}$$

and is represented in E by $[e_{1k}, e_{2k}, \dots]^T$ as the kth column.

The representation of b using $x^1$ is

$$b = \sum_{i=1}^{m} x^1_i\, a^{c_i}$$

but is also trivially equal, for any multiplier r, to

$$b = \sum_{i=1}^{m} (x^1_i - r e_{ik})\, a^{c_i} + r a^k$$

To represent b in terms of a basis in which $a^k$ replaces $a^{c_j}$,
all that is needed is to set

$$r = x^1_j / e_{jk}$$

assuming that $e_{jk} \ne 0$. The fact that $e_{jk}$ is not 0 assures that the
new basis would still have no more than m linearly independent members.

Since the coefficients in $x^2$ are $x^1_i - r e_{ik}$ and r, it is easy to
count how many are negative and therefore whether introduction of column
k into the basis can be advantageous.

If $e_{ik} > 0$ and $r \le x^1_i / e_{ik}$ then $x^2_i \ge 0$

If $e_{ik} < 0$ and $x^1_i / e_{ik} \le r$ then $x^2_i \ge 0$

If $e_{ik} = 0$ then $x^2_i = x^1_i$

Since if $r \ge 0$ then $x^2_k \ge 0$, the lower bounds on r should
also include 0.

Denote the least and greatest of the lower bounds on r (including 0) by
$r_1$ and $r_2$, respectively; denote the least and greatest of the upper
bounds by $r_3$ and $r_4$, respectively. By their definition

$$r_1 \le r_2 \quad \text{and} \quad r_3 \le r_4$$

If $r_2 \le r_3$ then all bounds on r will be satisfied if k displaces
either the basis vector determining $r_2$ or that determining $r_3$. The
number of negative coefficients in $x^2$ would then be

$$n^2(k,r) = \text{number of } x^1_i < 0 \text{ for which } e_{ik} = 0$$

If $r_1 < r_3 < r_2 < r_4$ then some bounds on r are violated and r
is best chosen in the interval

$$r_3 \le r \le r_2$$

since choice of r outside this interval would violate upper or lower
bounds without a compensating satisfaction of lower or upper bounds.

If $r_3 < r_1 < r_2 < r_4$ or $r_1 < r_3 < r_4 < r_2$, r should
again be chosen in the interval

$$r_3 \le r \le r_2$$

as there is only disadvantage outside this interval.

If $r_3 < r_1 < r_4 < r_2$ or $r_3 < r_4 < r_1 < r_2$, r is again
to be chosen in the interval

$$r_3 \le r \le r_2$$

which here says to examine all possibilities.

Thus in all cases r is to be chosen in the closed interval extending
between $r_2$ and $r_3$. It is in general necessary to search for all
bounds in the interval to see how many are violated. The only values of
r to be evaluated are the values equal to a bound. Then

$$n^2(k,r) = [\text{number of upper or lower bounds which are violated}] + [\text{number of } x^1_i < 0 \text{ for which } e_{ik} = 0]$$

This procedure is done for all k, or at least until an $n^2$ is found
which is less than $n^1$.
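
The count of negative coefficients for a trial column can be sketched as follows. For simplicity this illustration evaluates r at every bound $x^1_i / e_{ik}$ and at 0, rather than only inside the interval between $r_2$ and $r_3$. The function names and the numbers are assumptions made for the example.

```python
from fractions import Fraction


def count_negatives(x1, e_k, r):
    """Number of negative coefficients after entering column k with multiplier r."""
    new_coeffs = [xi - r * ei for xi, ei in zip(x1, e_k)] + [r]
    return sum(1 for c in new_coeffs if c < 0)


def best_multiplier(x1, e_k):
    """Evaluate r only at the bounds (and at 0); return (count, r)."""
    candidates = {Fraction(0)}
    for xi, ei in zip(x1, e_k):
        if ei != 0:
            candidates.add(Fraction(xi) / Fraction(ei))
    # Ties in the count are broken in favour of the smallest r.
    return min((count_negatives(x1, e_k, r), r) for r in sorted(candidates))


x1 = [-2, 3, 5]    # current coefficients, one of them negative
e_k = [-1, 1, 2]   # column k of E: one lower bound (2), two upper bounds (3, 5/2)
print(count_negatives(x1, e_k, 0))  # 1, the starting count
print(best_multiplier(x1, e_k))     # the count drops to 0 at r = 2
```

In this example the lower bound 2 lies below both upper bounds, so all bounds can be met at once and the negative coefficient is eliminated.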


If at the best value of r, there is only one bound, say for the basis
vector number $c_p$, which is column, say h, of A, then column $a^k$ of A
replaces column $a^h$ in the basis and the $c_p$th basis vector becomes
column $a^k$, rather than column $a^h$.


In  case  several  bounds  are  simultaneously  satisfied  at  the  best  value  of 
r,  k  replaces  only  one  of  them  in  the  basis  -  which  one  may  be  chosen 
arbitrarily.  Thus  the  basis  still  has  m  linearly  independent  vectors 
even  though  b  can  be  expressed  with  fewer. 


It is clear from the foregoing how $x^2$ is computed from $x^1$:

$$x^2_i = x^1_i - r e_{ik} \quad \text{for } i \ne p$$

with r the coefficient of the entering column k.

Also E used for the basis associated with $x^1$, which can be labeled
$E^1$, can be easily updated to the matrix $E^2$ associated with the basis
associated with $x^2$.


Column $a^j$ of A was represented by the jth column of $E^1$:

$$a^j = \sum_{i=1}^{m} e_{ij}\, a^{c_i}$$

In particular

$$a^k = \sum_{i=1}^{m} e_{ik}\, a^{c_i}$$

which implies

$$a^{c_p} = -\sum_{i \ne p} \frac{e_{ik}}{e_{pk}}\, a^{c_i} + \frac{1}{e_{pk}}\, a^k$$

where $e_{pk}$ was not 0 since the basis vector $c_p$ replaced by column k
was a bound for r and so had its $e_{pk}$ non zero.

Replacing $a^{c_p}$ in the equation for $a^j$ gives

$$a^j = \sum_{i \ne p} \left[ e_{ij} - \frac{e_{pj} e_{ik}}{e_{pk}} \right] a^{c_i} + \left( \frac{e_{pj}}{e_{pk}} \right) a^k$$

Thus

$$e^2_{ij} = e^1_{ij} - e^1_{pj}\, e^1_{ik} / e^1_{pk} \quad \text{for } i \ne p$$

$$e^2_{pj} = e^1_{pj} / e^1_{pk}$$

Note that this implies that column k of $E^2$ is all 0, except for
$e^2_{pk}$ which is 1.

Also whereas in $E^1$ column h had been all 0 except for $e^1_{ph}$ which
was 1, in $E^2$ this column becomes, according to the above formula

$$e^2_{ih} = -e^1_{ik} / e^1_{pk} \quad \text{for } i \ne p$$

and

$$e^2_{ph} = 1 / e^1_{pk}$$

Other columns, say column q, which are in both the basis associated with
$x^1$ as well as the basis associated with $x^2$ retain their form, all
0 except for a single 1, since $e^1_{pq}$ was 0 in $E^1$, being the
coefficient of $a^{c_p}$ in the representation of $a^q$ which was also in the
basis. Thus $e^2_{iq} = e^1_{iq}$.

We  see  therefore  that  all  E  matrices  contain  an  mxm  identity  matrix, 
in  permuted  form.  By  recording  which  columns  of  A  correspond  to  which 
columns  of  the  remaining  columns  of  E,  as  well  as  which  correspond  to 
the  rows  of  E,  it  is  possible  to  omit  storage  of  the  mxm  identity. 
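
The update formulas for x and E amount to a single pivot. The following sketch stores the full E, including the identity columns, which the report notes can be omitted. The function name `pivot` and the 2 x 3 example are assumptions made for illustration.

```python
from fractions import Fraction


def pivot(E, x, p, k):
    """Replace the p-th basis vector by column k; return the updated (E, x)."""
    m, n = len(E), len(E[0])
    e_pk = Fraction(E[p][k])
    if e_pk == 0:
        raise ValueError("e_pk must be nonzero for column k to enter at row p")
    r = Fraction(x[p]) / e_pk  # multiplier that drives the p-th coefficient to 0
    new_E = [[None] * n for _ in range(m)]
    new_x = [None] * m
    for i in range(m):
        for j in range(n):
            if i == p:
                new_E[i][j] = Fraction(E[p][j]) / e_pk
            else:
                new_E[i][j] = Fraction(E[i][j]) - Fraction(E[p][j]) * Fraction(E[i][k]) / e_pk
        # Row p now holds the coefficient r of the entering column.
        new_x[i] = r if i == p else Fraction(x[i]) - r * Fraction(E[i][k])
    return new_E, new_x


# Hypothetical example: A = [[1, 0, 1], [0, 1, 1]], b = (2, 3), basis = columns 0 and 1.
E1 = [[1, 0, 1], [0, 1, 1]]
x1 = [2, 3]
E2, x2 = pivot(E1, x1, p=0, k=2)
print(x2)  # coefficients of columns 2 and 1: 2*(1,1) + 1*(0,1) = (2,3)
print(E2)  # column 2 is now an identity column
```

After the pivot, column 0 of the new tableau reads (1, -1), which says that the departed column equals column 2 minus column 1, as can be checked by hand.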


The procedure will have found a feasible point once all coefficients of
x have been made non negative, i.e. once the count of negative coefficients
has become 0.


It is easy to observe what difficulties can arise in the choice of r by
use of the following schematic. A line representing values of r ranging
from negative to positive values is marked on the upper side with arrows at
the upper bounds on r, and on the lower side with arrows at the lower
bounds on r.

[Schematic: r axis; upper bounds ($e_{ik} > 0$) marked above the line and
lower bounds ($e_{ik} < 0$) marked below it, each labeled according to
whether the corresponding coefficient is - or + in $x^1$.]

The satisfaction of the upper bounds tends to decrease r, while the
satisfaction of the lower bounds tends to increase r.

The last point $x^1$ is represented by r = 0, where the lower bound on
the new column k is located.

If r is increased from 0 and the first bound it encounters is a lower
bound then that choice of r eliminates a negative coefficient and
includes a new positive coefficient, leaving other coefficients of x
unchanged in sign.

If however the first bound was an upper bound, that choice of r
represents no net change in the types of coefficients of x, i.e. +, 0, -.

If r was decreased and the first bound encountered, other than that at
0, was an upper bound, that choice of r represents no net change in the
type of coefficient.

If r was decreased and the first bound encountered, other than at 0, was
a lower bound, that choice of r represents a switch of a + coefficient
to a - coefficient.


Thus the simple strategy of choosing r at an upper or lower bound
nearest r = 0, which represents $x^1$, increases the + coefficients in
$x^2$ over those in $x^1$ only if r is increased and the first bound
encountered is a lower bound.


It  is  of  course  a  less  complex  procedure  simply  to  search  the  full  range 
of  bounds  on  r. 

As to which k column is most advantageous for inclusion, one criterion
is to choose at least initially the column in E which has most
uniformity in sign, since then most bounds on r are either upper or
lower bounds, and can easily be satisfied en masse.


o Choose basis among vectors with extreme values (max, min) of any
component.

o Form triangular decomposition, modifying choice of vectors as
decomposition proceeds.

o Record basis names in vector IR.

o Apply triangular factors to columns of A not in basis. Form matrix
E. Keep running count of sign diversity. Put vectors with extreme
value first. Record column number of non basic vectors in vector IC.

o If all entries in column have same sign + record for inclusion. If
all entries have sign - then a feasible point has been found.

o Apply factors to b to get x. Record number of minus signs in n.

o Calculate for each k: lower and upper bounds $r_1$, $r_2$, $r_3$, $r_4$.
Pick the r with the maximum advantage.

o Update E, x, IR, IC. Repeat preceding step.

o Quit if all x non-negative.


The task addressed by discriminant analysis is to find a surface that
separates two sets of points. If the coordinates of a point are substituted
into the mathematical equation of the surface, $f(x_1, x_2, \dots) = C$, the
point is on one side of the surface if the result exceeds C, is on the other
side if the result falls short of C, and is exactly on the surface if the
result equals C. The surface is said to discriminate between the two sets of
points.

The simplest type of surface is the plane, or as is sometimes said for higher
dimensional space, the hyperplane. The equation for a plane is of the form
$\sum_i a_i x_i = C$, that is, it is a linear function of the $x_i$ as well
as a linear function of the coefficients $a_i$.

It is possible to devise methods for nonlinear expressions. For example, the
ellipse in two dimensions has the form: $a_1 x_1^2 + a_2 x_2^2 = C$.

However, the methods for linear expressions can address this problem by
considering the expression as linear in the transformed variables $x_1^2$ and
$x_2^2$.

This device cannot treat expressions in which the unknown coefficients do not
appear linearly, such as $a_1 x_1^{a_2} + a_3 x_2^{a_4} = C$ or
$a_1 \sin(a_2 x_1) + a_3 \sin(a_4 x_2) = C$.
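
The transformed-variable device can be sketched as follows. The ellipse $x_1^2 + 4 x_2^2 = 4$ and the helper names are hypothetical choices made for the illustration.

```python
def squared_features(point):
    """Map (x1, x2) to the transformed variables (x1**2, x2**2)."""
    return tuple(v * v for v in point)


def side_of_surface(coeffs, features, C):
    """Return +1, -1 or 0 as the linear expression exceeds, falls short of, or equals C."""
    value = sum(a * f for a, f in zip(coeffs, features))
    return (value > C) - (value < C)


coeffs, C = (1, 4), 4  # the ellipse x1^2 + 4*x2^2 = 4, linear in (x1^2, x2^2)
for point in [(1, 0), (2, 0), (3, 0)]:
    print(point, side_of_surface(coeffs, squared_features(point), C))
```

Once the points have been mapped to the transformed variables, any linear discriminant method can be applied to them unchanged.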

Discriminant analysis can be viewed as geometric or as probabilistic. The
geometric view is expressed above. The probabilistic view seeks a surface
which discriminates with a high confidence. It can address situations in
which the sets of points substantially interpenetrate one another.

For the flutter prediction problem, it is believed that linear, geometric
discriminant analysis is the basic tool and that it will prove adequate,
especially when enhanced with nonlinear combination variables. Any
interpenetration of stability regions is believed attributable to imprecision
of measurement or to higher order effects not modeled here. In either case,
the degree of interpenetration is expected to be small.


The empirical flutter prediction problem is directly addressed by
discriminant analysis. The Annular Cascade Data Base consists of a large
number of test points, each identified as being in one out of fourteen
possible stability regions. Taking these regions two at a time, it is
desired to find a plane that separates each region from the other. To
conclude that a test point is in a stability region, it would, ideally,
need to be on the proper side of each of the thirteen planes that
discriminate that stability region.

GALACTIC contains two linear, geometric discriminant analysis methods.
Both utilize linear programming.

The first method addresses the easier question: whether there is a
discriminating plane. This question can be answered without actually
exhibiting the plane. The second method addresses the more difficult
problem of finding the plane. The first method will be called the
Discriminant Feasibility method.

The first method is valuable because it can identify the smallest number of
variables for which the sets are separable. The first method is quick and
robust. The second method, which is more costly and less robust, can
therefore avoid unprofitable exploration of small subspaces.

The  second  method  has  been  modified  to  become  a  third  method,  called  the 
Relaxed  Discriminant  method.  It  seeks  to  find  a  pair  of  parallel  planes 
that  discriminate  the  stability  regions.  Between  the  planes  there  is 
either  a  gap  or  an  overlap.  Thus  the  third  method  should  provide  an 
approximate  solution  even  when  the  regions  are  not  truly  separable. 

Consider that there are two sets of points in m dimensional space, one with
$n_1$ points and the other with $n_2$ points. These points can be
represented respectively as the rows of an $n_1 \times m$ matrix $X_1$ and
the rows of an $n_2 \times m$ matrix $X_2$.


The first method will simply determine whether the two sets of points
intersect, in the sense that their convex hulls intersect. In two
dimensions the convex hull is the smallest polygonal region that contains
all the points and has no reentrant corners. Algebraically a point z is in
the convex hull of the $n_1$ points in $X_1$ if z can be represented as a
linear combination of the rows of $X_1$

$$z = a X_1, \qquad a \ge 0$$

where a is a row vector whose components are non negative and which add to 1

$$\sum_{i=1}^{n_1} a_i = 1$$

Now z is in both sets if there is also a row vector b such that

$$z = b X_2, \qquad \sum_{i=1}^{n_2} b_i = 1, \qquad b \ge 0$$

The problem of determining whether there are points z which are in both
convex hulls may be converted to a standard linear programming problem as
follows. Equate the two expressions for z:

$$a X_1 - b X_2 = 0$$

or

$$(a \;\; b) \begin{pmatrix} X_1 \\ -X_2 \end{pmatrix} = 0$$

The $\sum a_i = 1$ and $\sum b_i = 1$ requirements can be incorporated

$$(a \;\; b) \begin{pmatrix} X_1 & 1 & 0 \\ -X_2 & 0 & 1 \end{pmatrix} = (0 \;\; 1 \;\; 1)$$

Combining a and b into $y = (a, b)^T$, this equation becomes

$$\begin{pmatrix} X_1^T & -X_2^T \\ 1^T & 0 \\ 0 & 1^T \end{pmatrix} y = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$$

Thus it is required to find an n entry non-negative column vector y which
satisfies the above m+2 equality constraints.
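
The assembly of these m+2 equality constraints can be sketched as follows. The function names are hypothetical, and the example only verifies a known feasible y by substitution; it does not search for one.

```python
from fractions import Fraction


def hull_intersection_system(X1, X2):
    """Return (M, rhs) for the system M y = rhs, y >= 0, with y = (a, b)."""
    m = len(X1[0])
    n1, n2 = len(X1), len(X2)
    # One row per coordinate: sum_i a_i X1[i][k] - sum_i b_i X2[i][k] = 0.
    M = [[X1[i][k] for i in range(n1)] + [-X2[i][k] for i in range(n2)]
         for k in range(m)]
    M.append([1] * n1 + [0] * n2)  # the a_i add to 1
    M.append([0] * n1 + [1] * n2)  # the b_i add to 1
    rhs = [0] * m + [1, 1]
    return M, rhs


def satisfies(M, rhs, y):
    """Check M y = rhs and y >= 0 for a candidate y."""
    return (all(v >= 0 for v in y) and
            all(sum(c * v for c, v in zip(row, y)) == r for row, r in zip(M, rhs)))


# Two crossing segments: their hulls share the point (1, 1).
X1 = [(0, 0), (2, 2)]
X2 = [(0, 2), (2, 0)]
M, rhs = hull_intersection_system(X1, X2)
half = Fraction(1, 2)
print(satisfies(M, rhs, [half, half, half, half]))  # True: the sets are not separable
```

Because a feasible y exists for this pair of sets, no discriminating plane can be found for them.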

The  objective  function  is  of  secondary  importance  since  the  basic  question 
here  is  whether  there  is  a  feasible  solution,  that  is,  whether  the  two 
convex  hulls  intersect.  If  they  do,  then  the  two  sets  cannot  be 
discriminated  linearly. 

Since linear programming codes generally require an objective function
$\sum_i c_i y_i$, it is convenient to pick the $c_i$ so as to emphasize the
role of certain points in $X_1$ or $X_2$.

However, the easy choice of making the $c_i$ for $X_1$ equal, and those for
$X_2$ equal, is not advisable since in view of the constraints
$\sum a_i = 1$, $\sum b_i = 1$, such an objective function would reduce to
$c_1 + c_{n_1+1} = \text{constant}$. While inconsequential for the
feasibility question, a constant might inadvertently cause programming
problems.

Thus to conclude that two sets cannot be linearly discriminated, it is
sufficient to show that a certain linear programming problem is feasible.
Conversely, if the linear programming problem is not feasible, then the
convex hulls of the two sets do not intersect and it is therefore possible
to find a plane that lies between them, intersecting neither set. The
plane is used to discriminate between the two sets. Note that this method
does not produce such a discriminating plane.

Determination  of  the  Discriminating  Plane 

The second method is more complicated, but it accomplishes more. It
identifies a discriminating plane, that is, a plane that separates two non
intersecting sets.

Any point x that satisfies xp = d is on a plane having column vector p as
perpendicular from the origin and situated a distance $d / \|p\|$ from the
origin.

Such a plane would discriminate the sets in $X_1$ and $X_2$ if

$$X_1 p \ge d, \qquad X_2 p \le d$$

or equivalently

$$\begin{pmatrix} X_1 & -1 \\ -X_2 & 1 \end{pmatrix} \begin{pmatrix} p \\ d \end{pmatrix} \ge 0$$

or abbreviating

$$Zq \ge 0$$

where Z is n x (m+1), n being $n_1 + n_2$, and q is an m+1 vector.
A few comments are appropriate.


There is no inequality constraint on the components of p or on d; hence it
is no restriction to choose that $X_1 p$ exceed d and for $X_2 p$ to be less
than d, rather than vice versa.

The problem as stated allows the two sets, and therefore the discriminating
plane, to have common boundary points. This could be prohibited by
requiring

$$X_1 p \ge d_1; \qquad X_2 p \le d_2; \qquad d_1 - d_2 = k$$

or

$$Zq \ge t$$

for some chosen k > 0. Z would have one extra column as well as an extra
row for the $d_1 - d_2 = k$ constant. If alternatively k were allowed to be
slightly negative, then some overlap between the two sets would be allowed.
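
The stacked inequality $Zq \ge 0$ can be checked for a trial plane with a few lines of Python. In this sketch the layout of Z (rows of $X_1$ with a trailing -1 above negated rows of $X_2$ with a trailing +1) follows from requiring $X_1 p \ge d$ and $X_2 p \le d$. The helper names and the sample points are assumptions.

```python
def build_Z(X1, X2):
    """Stack [X1, -1] over [-X2, +1] so that Z q >= 0 with q = (p, d)."""
    top = [list(x) + [-1] for x in X1]
    bottom = [[-v for v in x] + [1] for x in X2]
    return top + bottom


def slacks(Z, p, d):
    """Return t = Z q for q = (p, d); the plane discriminates if every entry is >= 0."""
    q = list(p) + [d]
    return [sum(z * v for z, v in zip(row, q)) for row in Z]


X1 = [(3, 1), (4, 2)]  # hypothetical points on the far side of the plane x1 = 2
X2 = [(0, 0), (1, 1)]  # hypothetical points on the near side
Z = build_Z(X1, X2)
t = slacks(Z, p=(1, 0), d=2)
print(t, all(v >= 0 for v in t))
```

Each entry of t is the amount by which the corresponding point clears the plane, xp - d for the first set and d - xp for the second.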

While the problem statement involves linear inequalities, it is not in
standard linear programming format since there are no inequalities
constraining q. The following converts the problem to standard format.

Write

$$Zq = t, \qquad t \ge 0$$

and utilize the so called generalized inverse $Z^+$ for Z

$$q = Z^+ t$$

Digression  on  Generalized  Inverses 

A digression about generalized inverses is appropriate here. First it will
be explained how best to compute the generalized inverse, and then what is
its usefulness.


The best way to compute $Z^+$ is to compute the singular value decomposition
of Z. This is a representation of the n x m matrix Z as the product

$$Z = U Q V^T$$

where U is n x n and orthogonal ($U^T = U^{-1}$), V is m x m and orthogonal
($V^T = V^{-1}$) and Q is an n x m matrix which is zero except for the
diagonal terms, which may or may not be 0. The generalized inverse $Q^+$ of
Q is defined as the m x n matrix which is zero except possibly for the
diagonal entries. The ii entry is 0 if $q_{ii} = 0$ and is $1/q_{ii}$ if
$q_{ii} \ne 0$.

Thus $I_Q = Q Q^+$ is an n x n matrix which is zero except that its ii
diagonal entry is 1 when $q_{ii} \ne 0$.

Now the generalized inverse of Z is

$$Z^+ = V Q^+ U^T$$

As a result,

$$Z Z^+ = U Q V^T V Q^+ U^T = U Q Q^+ U^T = U I_Q U^T$$

If for instance Z were square and non-singular, then Q would have no zeroes
on its diagonal and accordingly $I_Q$ would also not have zeroes on its
diagonal so that $Z Z^+ = U I U^T = I$ and $Z^+$ is the (usual) inverse
$Z^{-1}$.
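
As a small worked illustration (the matrix is chosen here for convenience and does not come from the test data), take U and V to be identity matrices, so that Z coincides with Q, with n = 3 and m = 2:

$$Z = Q = \begin{pmatrix} 2 & 0 \\ 0 & 3 \\ 0 & 0 \end{pmatrix}, \qquad Z^+ = Q^+ = \begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1/3 & 0 \end{pmatrix}$$

$$Z Z^+ = Q Q^+ = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} = I_Q, \qquad Z^+ Z = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

For this Z, $Z Z^+ t$ reproduces t only when the third component of t is zero, which is the kind of restriction on t that matters when $q = Z^+ t$ is required to satisfy $Zq = t$ exactly.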


If Zq = t represented a least squares problem in which $n \ge m$ and Z had
rank m, then the usual solution is

$$q = (Z^T Z)^{-1} Z^T t$$

which is

$$(V Q^T U^T U Q V^T)^{-1}\, V Q^T U^T t$$

$$= (V Q^T Q V^T)^{-1}\, V Q^T U^T t$$

$$= (V^T)^{-1} (Q^T Q)^{-1} V^{-1}\, V Q^T U^T t$$

$$= V (Q^T Q)^{-1} Q^T U^T t$$

But since Z had rank m, then Q could not have zeroes on its diagonal, $Q^T Q$
is an m x m non-singular diagonal matrix and $(Q^T Q)^{-1} Q^T = Q^+$. Thus
the usual least squares solution is also the generalized inverse

$$q = V Q^+ U^T t$$

Other situations which have been treated in traditional ways could also be
presented as persuasive of the reasonability and practicality of the stated
generalized inverse. However, it is perhaps more useful now to give the
following derivation.

Since any matrix, whether square or rectangular, singular or non-singular,
has a singular value decomposition, the problem Zq = t can be written
equivalently as

$$U Q V^T q = t$$

$$Q V^T q = U^T t$$

$$Q r = s$$

where $r = V^T q$, $s = U^T t$, and q can be gotten from r by

$$q = V r$$

The equation Qr = s, spelled out, is simply

$$q_{11} r_1 = s_1$$

$$q_{22} r_2 = s_2$$

and so on. If $q_{ii} \ne 0$, then $r_i = s_i / q_{ii}$. But if $q_{ii} = 0$
then $r_i$ is arbitrary. According to the generalized inverse defined above,
$r_i$ is set to 0 when $q_{ii} = 0$. This says that

$$r = Q^+ s$$

But then

$$q = V r = V Q^+ s = V Q^+ U^T t$$

which is the formula presented above.

Determination of the Discriminating Plane (continued)

For the discrimination question, it is necessary to ask when the
generalized inverse is an exact solution. This is the case provided that
Zq equals t, that is, provided $Z Z^+ t = t$ or

$$[Z (V Q^+ U^T) - I]\, t = 0$$

$$[U Q V^T V Q^+ U^T - I]\, t = 0$$

$$[U Q Q^+ U^T - I]\, t = 0$$

Here $Q Q^+$ is an n x n diagonal matrix, say D, whose ii entry is 1 if
$q_{ii} \ne 0$ and 0 if $q_{ii} = 0$. Thus the above may be written

$$[U D U^T - I]\, t = 0$$

or

$$[U (D - I) U^T]\, t = 0$$

or

$$U I_1 U^T t = 0$$

Since U is non-singular, this may be written simply as

$$I_1 U^T t = 0$$

where $I_1 = I - D$ is an n x n diagonal matrix whose ii entry is 0 if
$q_{ii} \ne 0$ and +1 if $q_{ii} = 0$. This is the necessary and sufficient
condition that

$$q = V Q^+ U^T t$$

is an exact solution for Zq = t. This might seem to be a virtually
impossible condition to meet. Indeed when Z is rectangular and of rank
less than m, as in a least square problem, it is not likely that q can be
found which will exactly satisfy Zq = t for a preassigned t.

But here the t is not preassigned; any t with non-negative components will
do. Indeed t is simply an eigenvector of $U I_1 U^T$ which corresponds to a
0 eigenvalue. The necessary and sufficient condition $U I_1 U^T t = 0$ can be
simplified to

$$I_1 U^T t = 0$$

since U is square and non-singular.

For the discriminant analysis task at hand, finding a q such that Zq is non
negative is seemingly effortlessly satisfied by choosing a non-negative t
(that satisfies any other constraints) and calculating

$$q = Z^+ t$$

provided that this implies Zq = t. This implication is valid if

$$Z Z^+ t = t$$

which is developed above.


The problem of finding a discriminating plane between two sets has thus
been reduced to the following linear programming problem in standard
format: find t such that

$$I_1 U^T t = 0, \qquad t \ge 0$$

This is a system of at least n-m equations in n unknowns:
$t_1, t_2, \dots, t_n$.

The solution of the system of equations can actually be expressed as

$$t = U I_2 c$$

where c is arbitrary and $I_2 = I - I_1$, which is D defined above. To show
this, substitution gives

$$I_1 U^T U I_2 c = I_1 (I - I_1) c = 0$$

Thus t is a linear combination of the rows of $U^T$ which are not in the
system $I_1 U^T t = 0$.

Clearly then the constraint $I_1 U^T t = 0$ describes an unbounded region.

If any vector in this region satisfies $t \ge 0$, an arbitrarily large
multiple of the vector also is feasible. Thus the feasibility region, if it
exists, is unbounded.

The linear programming problem thus posed has a finite solution only if the
objective function seeks small rather than large feasible points. But
seeking small feasible points leads to the origin, which is clearly a
feasible point. Thus the linear programming problem leads to either
ill-defined infinite solutions or the trivial solution.

To avoid the dilemma, a constraint

$$\sum_i t_i \le 1$$

may be included, together with an objective function such as

maximize $\sum_i t_i$.

Since the feasible region, if it exists, had been unbounded, this objective
function will locate the optimum on the new constraint, always giving 1 as
the value of the objective function.


The objective function is, however, not without meaning. For a point in the
first set, its t is xp-d, while for a point in the second set, its t is
d-xp, according to $X_1 p \ge d \ge X_2 p$. This says that t is the
perpendicular distance of the point from the plane. Thus the stated
objective function is the sum of the perpendicular distances of the n points
from the plane, the points in one set being on one side of the plane and
the points in the second set being on the other side. Thus the objective
function seeks to maximize, in this sense, the distance of the sets from
the plane.

If  the  linear  programming  problem  does  not  have  a  solution,  then  there  is 
no  plane  that  separates  the  two  sets. 

If the linear programming problem does have a solution t, then

$$q = VQ^+U^T t$$

The first m components of q constitute the vector p, and the last
component, the (m+1)st, is the distance d.

This p and d satisfy

$$X_1 p \ge d$$

$$X_2 p \le d$$

so that the plane xp = d discriminates between the two sets.




Example 1: Discriminating planes exist


The points in $X_1$ are denoted by o, and those in $X_2$ by x.

To apply the first method, set up the LP problem: find $y \ge 0$ such that

$$\begin{pmatrix} 2 & 3 & 0 & 0 \\ 0 & 0 & -2 & -3 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} y = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}$$

There is a unique solution to the equality for y

$$y = \begin{pmatrix} 3 \\ -2 \\ 3 \\ -2 \end{pmatrix}$$

which, however, violates the requirement $y \ge 0$. Thus the LP problem is
non-feasible and according to the first method the points can be
linearly discriminated. Note how easy the first method is, but how
minimal is the information it supplies.
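The feasibility test of the first method can be sketched in a few lines for this tiny case. The sketch below is only an illustration, not the report's program: it assumes the equality system is square and non-singular (true for the two small examples in this section, not in general, where a linear programming solver would be needed), and the names `solve_square` and `first_method` are hypothetical.

```python
from fractions import Fraction


def solve_square(a, b):
    """Solve a y = b exactly by Gauss-Jordan elimination.

    Assumes `a` is square and non-singular, as in the small examples here.
    """
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def first_method(a, b):
    """Return (feasible, y); feasible is True when the unique y is non-negative."""
    y = solve_square(a, b)
    return all(v >= 0 for v in y), y


# Equality constraints of the example above.
A1 = [[2, 3, 0, 0], [0, 0, -2, -3], [1, 1, 0, 0], [0, 0, 1, 1]]
RHS = [0, 0, 1, 1]
feasible, y = first_method(A1, RHS)
print(feasible, [str(v) for v in y])
```

A non-feasible result means that no non-negative y exists, which in the first method is read as "the points can be linearly discriminated".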


The  second  method  requires  finding  p  and  d  so  that 


where  t  solves 


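$$I_1 U^T t = 0, \qquad t \ge 0$$

where

$$I_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Thus t solves

$$(-.58834842 \quad +.39223228 \quad -.58834839 \quad +.39223226) \begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \end{pmatrix} = 0$$

which allows arbitrary choice for $t_1$, $t_2$, $t_3$, whereupon $t_4$ is
determined by

$$t_4 = (3/2)\,t_1 - t_2 + (3/2)\,t_3$$

However, the condition that $t \ge 0$ requires that

$$(3/2)(t_1 + t_3) \ge t_2 \ge 0$$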

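To get $\begin{pmatrix} p \\ d \end{pmatrix} = VQ^+U^T t$, the equation for $Q^+$ is

$$Q^+ = \begin{pmatrix} .2433852 & 0 & 0 & 0 \\ 0 & .2773501 & 0 & 0 \\ 0 & 0 & 2.9052994 & 0 \end{pmatrix}$$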


so  that 


VQ+ 


,15086421  -.19611613  -.98855428  0 
,15086421  .19611613  -.98855428  0 
,11711671  .37476796E-8  -2.5468183  0 


$$VQ^+U^T = \begin{pmatrix} -.42307674 & .61538439 & .5769228 & -.3846161 \\ -.5769228 & .3846152 & .4230767 & -.6153844 \\ -1.4999998 & .9999999 & 1.4999998 & -.9999999 \end{pmatrix}$$


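The more exact answer is

$$VQ^+U^T = \frac{1}{26}\begin{pmatrix} -11 & 16 & 15 & -10 \\ -15 & 10 & 11 & -16 \\ -39 & 26 & 39 & -26 \end{pmatrix}$$

so that

$$\begin{pmatrix} p_1 \\ p_2 \\ d \end{pmatrix} = VQ^+U^T t = \frac{1}{26}\begin{pmatrix} -11 & 16 & 15 & -10 \\ -15 & 10 & 11 & -16 \\ -39 & 26 & 39 & -26 \end{pmatrix}\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ (3/2)(t_1 + t_3) - t_2 \end{pmatrix}$$

$$= \frac{1}{26}\left[\, t_1 \begin{pmatrix} -26 \\ -39 \\ -78 \end{pmatrix} + t_2 \begin{pmatrix} 26 \\ 26 \\ 52 \end{pmatrix} + t_3 \begin{pmatrix} 0 \\ -13 \\ 0 \end{pmatrix} \right]$$

$$= -t_1 \begin{pmatrix} 1 \\ 1.5 \\ 3 \end{pmatrix} + t_2 \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} + t_3 \begin{pmatrix} 0 \\ -0.5 \\ 0 \end{pmatrix}$$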
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As  a  check,  compute 


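$$\begin{pmatrix} 2 & 0 & -1 \\ 3 & 0 & -1 \\ 0 & -2 & 1 \\ 0 & -3 & 1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \\ d \end{pmatrix} = t_1 \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1.5 \end{pmatrix} + t_2 \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix} + t_3 \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1.5 \end{pmatrix}$$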

The components are all obviously non-negative except for the last, which
is

$$(3/2)(t_1 + t_3) - t_2$$

and is non-negative by choice of $t_2$ in the range

$$(3/2)(t_1 + t_3) \ge t_2 \ge 0$$
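The check above can be repeated numerically. The following is a minimal sketch in exact rational arithmetic, assuming the matrix Z of this example has rows (2, 0, -1), (3, 0, -1), (0, -2, 1), (0, -3, 1); the function names `q_from_t` and `apply_matrix` are illustrative only.

```python
from fractions import Fraction as F

# Z = [X1, -1; -X2, +1] for X1 = {(2,0), (3,0)} and X2 = {(0,2), (0,3)}.
Z = [[2, 0, -1], [3, 0, -1], [0, -2, 1], [0, -3, 1]]


def q_from_t(t1, t2, t3):
    """Return q = (p1, p2, d) and the implied t4 for free parameters t1, t2, t3."""
    t1, t2, t3 = F(t1), F(t2), F(t3)
    t4 = F(3, 2) * (t1 + t3) - t2
    p1 = -t1 + t2
    p2 = -F(3, 2) * t1 + t2 - F(1, 2) * t3
    d = -3 * t1 + 2 * t2
    return (p1, p2, d), t4


def apply_matrix(z, q):
    """Compute the product z q as a list of exact fractions."""
    return [sum(F(a) * x for a, x in zip(row, q)) for row in z]


q, t4 = q_from_t(1, 2, 1)          # satisfies (3/2)(t1 + t3) >= t2 >= 0
print(q, t4, apply_matrix(Z, q))   # Zq should reproduce (t1, t2, t3, t4)
```

For this choice the plane is p = (1, 0), d = 1, that is $x_1 = 1$, which lies between the two sets.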


While the singular value decomposition is the more useful general
approach to computing generalized inverses, the generalized inverse can
sometimes (when $Z^T Z$ is non-singular) be computed more simply and exactly as

$$Z^+ = (Z^T Z)^{-1} Z^T$$


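Thus in Example 1

$$Z^+ = \frac{1}{26}\begin{pmatrix} 27 & 25 & 65 \\ 25 & 27 & 65 \\ 65 & 65 & 169 \end{pmatrix} Z^T = \frac{1}{26}\begin{pmatrix} -11 & 16 & 15 & -10 \\ -15 & 10 & 11 & -16 \\ -39 & 26 & 39 & -26 \end{pmatrix}$$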
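The exact generalized inverse quoted here can be reproduced with rational arithmetic. This is a sketch under the assumption that Z has full column rank, so that $(Z^T Z)^{-1}$ exists; the helper names (`transpose`, `matmul`, `inverse`, `pinv_full_column_rank`) are illustrative and not taken from the report.

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(a):
    """Exact inverse of a square non-singular matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def pinv_full_column_rank(z):
    """Z+ = (Z^T Z)^-1 Z^T, valid only when Z^T Z is non-singular."""
    zt = transpose(z)
    return matmul(inverse(matmul(zt, z)), zt)


Z = [[2, 0, -1], [3, 0, -1], [0, -2, 1], [0, -3, 1]]
Z_pinv = pinv_full_column_rank(Z)
# Residual matrix Z Z+ - I, whose null space contains the admissible t.
ZZp = matmul(Z, Z_pinv)
residual = [[ZZp[i][j] - int(i == j) for j in range(4)] for i in range(4)]
print([[str(26 * v) for v in row] for row in Z_pinv])
```

Working in fractions avoids the rounding visible in the singular value decomposition output, at the price of requiring a full-rank $Z^T Z$.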
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Hence  the  equation  for  t  is 


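$$0 = (ZZ^+ - I)\,t = \left[\frac{1}{26}\begin{pmatrix} 17 & 6 & -9 & 6 \\ 6 & 22 & 6 & -4 \\ -9 & 6 & 17 & 6 \\ 6 & -4 & 6 & 22 \end{pmatrix} - I\right] t = \frac{1}{26}\begin{pmatrix} -9 & 6 & -9 & 6 \\ 6 & -4 & 6 & -4 \\ -9 & 6 & -9 & 6 \\ 6 & -4 & 6 & -4 \end{pmatrix} t$$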

which reduces to the single equation

$$0 = -(3/2)\,t_1 + t_2 - (3/2)\,t_3 + t_4$$

or

$$t_4 = (3/2)\,t_1 - t_2 + (3/2)\,t_3$$

The condition that $t \ge 0$ is fulfilled if $t_1$ and $t_3$ are arbitrarily
chosen as non-negative, $t_2$ satisfies

$$(3/2)(t_1 + t_3) \ge t_2 \ge 0$$

and

$$t_4 = (3/2)(t_1 + t_3) - t_2$$

From  this, the  equation  for  p  and  d  is 
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:  Discriminating  planes  exist 


n.j  -2  m-2 


To apply the first method, set up the LP problem: find $y \ge 0$ such that

$$\begin{pmatrix} 0 & 3 & -2 & 0 \\ 2 & 0 & 0 & -3 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} y = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}$$

There is a unique solution to the equality for y

$$y = \frac{1}{5}\begin{pmatrix} 3 \\ 2 \\ 3 \\ 2 \end{pmatrix}$$

which satisfies the requirement $y \ge 0$. Thus the LP problem is feasible
and according to the first method the points cannot be linearly
discriminated. Nonetheless, the second method will be attempted to
show the manner in which it will fail.

The second method requires finding p and d so that

$$\begin{pmatrix} 0 & 2 & -1 \\ 3 & 0 & -1 \\ -2 & 0 & 1 \\ 0 & -3 & 1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \\ d \end{pmatrix} \ge 0$$

No non-trivial solution is possible since the inequalities are

$$3p_1 \ge d \ge 2p_1$$

$$2p_2 \ge d \ge 3p_2$$

The first requires that $p_1 \ge 0$ and $d \ge 0$ while the second requires
that $p_2 \le 0$ and $d \le 0$. Thus $p_1 = p_2 = d = 0$.

The  singular  value  decomposition  of  the  above  matrix  is  given  as 

-2  3  0  0  -  UQVT 

0  0  2  -3 
0  110 
10  0  1 


where 


-.41884516  .56970937  .41884512  -.56970932 

-.39223222  .58834834  -.39223229  .5883484 

.56970932  .41884512  -.56970934  -.41884515 

i -.58834842  -.39223228  -.58834839  -.39223226 


-  4.108733 


3.6055508 


.34419858 


-  .61985786  .70710670 

.61985777  -.70710675 


.34025909 

.34025908 


-.48119902  .21 70441 4E-7  .87661133 


as  computed  by  the  IMSL  subroutine  LSVDF.  The  solution  is 


$$\begin{pmatrix} p \\ d \end{pmatrix} = VQ^+U^T t$$

where t solves

$$I_1 U^T t = 0, \qquad t \ge 0$$

where $I_1$ again has its only non-zero entry, +1, in the 4,4 position.

Thus t solves

$$(-.58834842 \quad -.39223228 \quad -.58834839 \quad -.39223226)\; t = 0$$

Thus $t_1$, $t_2$, $t_3$ can be chosen arbitrarily, but

$$t_4 = -(3/2)\,t_1 - t_2 - (3/2)\,t_3$$

To satisfy the requirement that $t \ge 0$, it is however necessary that

$$t_2 \le -(3/2)(t_1 + t_3)$$

This can be satisfied only if t = 0. Thus a discriminating plane cannot
be found. Nevertheless, for comparison purposes, the process will be
pursued further.

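To get $\begin{pmatrix} p \\ d \end{pmatrix} = VQ^+U^T t$, the equation for $Q^+$ is

$$Q^+ = \begin{pmatrix} .243384 & 0 & 0 & 0 \\ 0 & .2773501 & 0 & 0 \\ 0 & 0 & 2.9053003 & 0 \end{pmatrix}$$

so that

$$VQ^+ = \begin{pmatrix} .1508635 & .1961161 & .9885546 & 0 \\ .1508635 & -.1961161 & .9885546 & 0 \\ -.1171162 & 6.019721\mathrm{E}{-9} & 2.5468191 & 0 \end{pmatrix}$$

$$VQ^+U^T = \begin{pmatrix} .4230773 & .6153842 & -.5769234 & -.384615 \\ .5769309 & .384615 & -.4230773 & -.6153842 \\ 1.5 & 1 & -1.5 & -1 \end{pmatrix}$$

or more exactly

$$VQ^+U^T = \frac{1}{26}\begin{pmatrix} 11 & 16 & -15 & -10 \\ 15 & 10 & -11 & -16 \\ 39 & 26 & -39 & -26 \end{pmatrix}$$

so that

$$\begin{pmatrix} p \\ d \end{pmatrix} = VQ^+U^T t$$

which is zero since, as noted earlier, t = 0.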

Typically n is much bigger than m. For example there might be n = 1000
points in an m = 10 dimensional space.

Accordingly U may be very large: n x n, and V very small: m x m. The
matrix Q is n x m with no more than m non-zero diagonal entries. Hence
$I_1$ is n x n with at least n-m non-zero diagonal entries and $I_2$ is n
x n with at most m non-zero diagonal entries.

By suppressing zero rows or columns, the problem $I_1 U^T t = 0$, $t \ge 0$
involves an (n-m) x n matrix, having 990,000 entries, which is very large.
But the problem $U I_2 c \ge 0$ involves an n x m matrix, of 10,000 entries, which is
much smaller.

Thus the latter problem is much more tractable with respect to size.
However, it entails two difficulties: first, it is a non-standard linear
programming problem and second, getting to it requires, it would seem,
computation of the very large n x n matrix U.

It turns out that both difficulties can be overcome. First, U need not
be computed explicitly. The singular decomposition representation of Z
as $UQV^T$, where $UU^T = I$ and $VV^T = I$, implies that

$$Z^T Z = V Q^T Q V^T$$

Thus the singular decomposition calculation applied to the small m x m
matrix $Z^T Z$ can supply V and $Q^T Q$. Since the entries in Q can be
assumed to be non-negative, this also supplies Q. But then

$$ZV = UQ$$

$$ZVQ^+ = UD$$

which is $U I_2$. Thus the matrix UD, which is no bigger than the n x m
matrix Z, can be gotten without computing the very large n x n matrix U.
The non-standard linear programming problem

$$U I_2 c \ge 0$$

$$1^T U I_2 c = 1$$

is solved next using an approach based on the usual simplex algorithm.
Note that once c is gotten, it can be transformed to q by

$$q = VQ^+U^T t = VQ^+U^T U I_2 c = VQ^+ I_2 c = VQ^+ c$$

which, again, does not require explicit calculation of U.


The method described above for construction of a discriminating plane
will now be improved upon. When a discriminating plane exists, so that
there is a gap between the two sets, the new method produces a pair of
parallel planes as far apart as possible, for which the entire gap
between them separates the sets. When a discriminating plane does not
exist, since the sets overlap, the new method again seeks a pair of
parallel planes which are as close to one another as possible, and for
which one set is on one side of one plane and the other set is on the
other side of the other plane.

The new method, called the Relaxed Discriminant Method, is advantageous
since it is always able, in theory, to solve any problem and provides
additional information where the prior, parent method for discrimination
also applies.

Suppose that sets $X_1$ and $X_2$ interpenetrate in the sense that there
is no plane that lies between the two sets. Consider then the easier
task of finding a plane having set $X_1$ on one side, and a parallel
plane having set $X_2$ on the other side

$$X_1 p \ge d_1$$

$$X_2 p \le d_2$$

where $d_2 \ge d_1$. Thus there can be points, say x, which satisfy both
inequalities

$$d_2 \ge xp \ge d_1.$$

By making $d_2$ and $d_1$ as close together as possible, a band is
determined which discriminates between the two sets in the sense that
all points on one side of the band belong to $X_1$, and all points on the
other side belong to $X_2$.
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Thus  the  band  satisfies  a  relaxed  discriminant  problem.  But  it  serves 
other important purposes. First, it identifies the troublesome points,
being the points within the band. These may be found to be suspect for
some reason. Or the points may be found to have sufficient
observational error that the band can be collapsed to a plane. If the
validity of the points is, upon examination, not impugned, then three
approaches are available. One approach is to say that within the band,
the sets cannot be discriminated, which may be acceptable if the band
is narrow. A second approach is to introduce probabilities so that all
sets are assumed to interpenetrate and all that can be said is the
probability that a point belongs to any particular set. The third
approach is to use a nonlinear discriminant surface, which would fit
within the band, separating the members of the sets within the band. It
would seem that constructing such a surface would be helped by knowledge
of the band and the difficult points within the band.
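Identifying the troublesome points is a simple test once p, $d_1$ and $d_2$ are known. The sketch below is illustrative only: the function name `classify_against_band` and its labels are hypothetical, points lying exactly on a bounding plane are counted as inside the band, and the numbers are those of the second relaxed example later in this section with $t_2 = 1/2$.

```python
from fractions import Fraction


def classify_against_band(points, p, d1, d2):
    """Label each point by its position relative to the band d1 <= x.p <= d2."""
    labels = []
    for x in points:
        s = sum(Fraction(xi) * pi for xi, pi in zip(x, p))
        if s > d2:
            labels.append("X1 side")   # X2 satisfies x.p <= d2, so this must be X1
        elif s < d1:
            labels.append("X2 side")   # X1 satisfies x.p >= d1, so this must be X2
        else:
            labels.append("in band")   # troublesome point, not discriminated
    return labels


p = (Fraction(1, 10), Fraction(-1, 10))
d1, d2 = Fraction(-1, 5), Fraction(1, 5)
points = [(0, 2), (3, 0), (2, 0), (0, 3)]
labels = classify_against_band(points, p, d1, d2)
print(list(zip(points, labels)))
```

The points reported as "in band" are the candidates for re-examination or for a nonlinear surface fitted inside the band.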

Note that the case $d_1 \ge d_2$ is also of considerable interest since a
band is thereby defined which contains points of neither set. The plane
bisecting the band discriminates the two sets, and would not have any
marginal points which are on the border line between the two sets. The
band, however, also indicates a region in which there is no data, and
might with more data be found to be assignable to the two sets in some
manner.


The problem to be solved is therefore to find p, $d_1$ and $d_2$ such that

$$X_1 p \ge d_1$$

$$X_2 p \le d_2$$

$$d_2 - d_1 \ge 0$$

For the case of interpenetrating sets $d_2 - d_1$ cannot be less than
some positive quantity; it is desirable to minimize $d_2 - d_1$.

The above problem for interpenetrating sets can be written in matrix
form as maximizing $d_1 - d_2$ subject to

$$\begin{pmatrix} X_1 & -1 & 0 \\ -X_2 & 0 & +1 \\ 0 & -1 & +1 \end{pmatrix}\begin{pmatrix} p \\ d_1 \\ d_2 \end{pmatrix} = t \ge 0$$

The solution is given by $q = Z^+ t$ provided that

$$t \ge 0$$

and

$$(ZZ^+ - I)\,t = 0$$

The latter equation reduces as before to

$$I_1 U^T t = 0$$

where U is part of the singular value decomposition of Z, that is $Z = UQV^T$,
and $I_1$ is a diagonal matrix whose ii entry is 0 if $q_{ii} \ne 0$
and +1 if $q_{ii} = 0$.

If there are n rows in $X_1$ and $X_2$ combined, and they have m columns
each, then U is an (n + 1) x (n + 1) matrix and Q is (n + 1) x (m + 2).

If r is the rank of Q then $I_1$ has only n + 1 - r non-zero rows. Thus
$I_1 U^T$ can be written as an (n + 1 - r) x (n + 1) matrix.

The problem is now in standard linear programming form: find t such that

$$t \ge 0$$

$$I_1 U^T t = 0$$

maximize $-t_{n+1}$.

If the constraints however are satisfied by a vector t, then they are
also satisfied by any positive multiple of t. The objective function
would drive that multiple to 0.

This unsatisfactory situation also can be seen in the problem as
originally stated, by letting p = 0, $d_1 = 0$, $d_2 = 0$. Clearly it
arises because no constraint was imposed on the length of p, such as the
classical constraint that $\|p\| = 1$. Such a constraint would introduce a
nonlinear aspect and so is to be avoided. Instead the linear constraint

$$\sum_{i=1}^{n} t_i \ge 1$$

will be used. The objective function should drive any solution to the
$\sum t_i = 1$ plane, since if t is a solution not on the plane, then kt with
$k = 1/\sum t_i$ is also a solution, has smaller $t_{n+1}$, and is on the
plane.

The new constraint specifies that the distances of each point from its
bounding plane add to at least one; the distances being defined so as to
be always positive.

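Consider now the problem when the sets are separated, and as wide a band
as possible is sought between them. In algebraic terms the problem is
to find p, $d_1$, $d_2$ so that

$$\begin{pmatrix} X_1 & -1 & 0 \\ -X_2 & 0 & +1 \\ 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} p \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_{n+1} \end{pmatrix} \ge 0$$

and $t_{n+1} = d_1 - d_2$ is maximized.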


Converted to standard linear programming form, the problem is to find t
such that

$$t \ge 0$$

$$I_1 U^T t = 0$$

Maximize $t_{n+1}$.

Here again, if the constraints are satisfied by t, they are also
satisfied by any positive multiple of t, and the objective function
drives that multiple to infinity. The reason is that the problem has
not been normalized. To do so here the additional constraint

$$\sum_{i=1}^{n} t_i \le 1$$

is imposed.


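The intersection and the separation problems can now be combined as:

Find p, $d_1$, $d_2$ such that

$$Z \begin{pmatrix} p \\ d_1 \\ d_2 \end{pmatrix} = t \ge 0$$

$$\pm \sum_{i=1}^{n} t_i \ge \pm 1$$

maximize $\mp t_{n+1}$

or, in standard form: Find t such that

$$t \ge 0$$

$$I_1 U^T t = 0$$

$$\pm \sum_{i=1}^{n} t_i \ge \pm 1$$

Maximize $\mp t_{n+1}$

where the upper signs are for the intersection problem and the lower for
the separation problem. Here $UQV^T$ is the singular value decomposition
of

$$Z = \begin{pmatrix} X_1 & -1 & 0 \\ -X_2 & 0 & +1 \\ 0 & \mp 1 & \pm 1 \end{pmatrix}$$

and $I_1$ is a square diagonal matrix whose ii entry is 0 if $q_{ii} \ne 0$
and is +1 if $q_{ii} = 0$.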

Since the labeling of sets is arbitrary, it is of interest to determine
what happens if the labels on sets 1 and 2 were interchanged.

The matrix equation in geometric variables is


X2  -1  01 


0  Tl  +1 


This  can  be  rewritten  successively  as 


■X2  -1  0 


0  *1  +1 


SPBW 


The normalizing constraint

$$\pm \sum_{i=1}^{n} t_i \ge \pm 1$$

is unchanged, as is the objective function.

Thus, the solution of the problem for sets $X_2$, $X_1$ is the negative of
the solution for sets $X_1$, $X_2$. Thus, the same discriminating planes
result no matter which labeling is used.


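Example 1: The two sets are separated

$n_1 = 2$, $n_2 = 2$, $n = 4$, $m = 2$

$$X_1 = \begin{pmatrix} 2 & 0 \\ 3 & 0 \end{pmatrix} \text{ marked with o}, \qquad X_2 = \begin{pmatrix} 0 & 2 \\ 0 & 3 \end{pmatrix} \text{ marked with x}$$

It is required to solve

$$\begin{pmatrix} 2 & 0 & -1 & 0 \\ 3 & 0 & -1 & 0 \\ 0 & -2 & 0 & 1 \\ 0 & -3 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \\ t_5 \end{pmatrix} \ge 0$$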

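Solving the first four equations gives

$$\begin{pmatrix} p_1 \\ p_2 \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ -3 & 2 & 0 & 0 \\ 0 & 0 & 3 & -2 \end{pmatrix}\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \end{pmatrix}$$

so that

$$t_5 = d_1 - d_2 = -3(t_1 + t_3) + 2(t_2 + t_4)$$

The constraint

$$1 = t_1 + t_2 + t_3 + t_4$$

implies that

$$(t_1 + t_3) = -(t_2 + t_4) + 1$$

so that

$$t_5 = 3[-1 + (t_2 + t_4)] + 2(t_2 + t_4) = 5(t_2 + t_4) - 3$$

and $t_5 \ge 0$ requires that $t_2 + t_4 \ge 3/5$.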


Since $d_1 - d_2$, which is $t_5$, is to be maximized, it follows that
$t_2 + t_4$ is to be maximized. Since the t's are non-negative, the
optimum choice is

$$t_1 = t_3 = 0, \qquad t_2 + t_4 = 1$$

This gives

$$p_1 = t_2$$

$$p_2 = -t_4 = t_2 - 1$$

$$d_1 = 2t_2$$

$$d_2 = -2t_4 = 2t_2 - 2$$

and

$$d_1 - d_2 = 2$$


with $t_2$ undetermined in the range $0 \le t_2 \le 1$. The following
shows the situation for $t_2 = 0$, 1/2, 1.

[Figure: the pair of bounding planes for $t_2 = 0$, 1/2 and 1; in each case $d_1 - d_2 = 2$.]

Here the shaded region is prohibited for either set of points.
Consider now the inequality constraint

$$1 \ge t_1 + t_2 + t_3 + t_4$$

which says that

$$(t_1 + t_3) \le -(t_2 + t_4) + 1$$

The constraint $t_5 \ge 0$ implies that

$$(t_1 + t_3) \le (2/3)(t_2 + t_4)$$

Choosing $t_1 + t_3 = 0$ does not constrain the choice of $t_2 + t_4$ and
benefits the objective. The constraint then on $t_2 + t_4$ is simply

$$0 \le t_2 + t_4 \le 1$$

Choosing $t_2 + t_4 = 1$ benefits the objective most.

Thus, the inequality constraint $1 \ge t_1 + t_2 + t_3 + t_4$ leads to the
same result as the equality constraint $1 = t_1 + t_2 + t_3 + t_4$.
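The optimum of this small example can be confirmed by brute force. The sketch below is not a simplex implementation; it simply enumerates a coarse grid on the normalizing plane $t_1 + t_2 + t_3 + t_4 = 1$, using the closed-form solution of the first four equations ($p_1 = t_2 - t_1$, $p_2 = t_3 - t_4$, $d_1 = 2t_2 - 3t_1$, $d_2 = 3t_3 - 2t_4$). The function names and the grid resolution are arbitrary choices.

```python
from fractions import Fraction as F
from itertools import product


def example1_solution(t1, t2, t3, t4):
    """Solve the first four equations of the separated example for (p1, p2, d1, d2)."""
    return t2 - t1, t3 - t4, 2 * t2 - 3 * t1, 3 * t3 - 2 * t4


def widest_gap(steps=10):
    """Grid search for the largest t5 = d1 - d2 subject to t >= 0 and sum(t) = 1."""
    grid = [F(k, steps) for k in range(steps + 1)]
    best = None
    for t1, t2, t3 in product(grid, repeat=3):
        t4 = 1 - t1 - t2 - t3
        if t4 < 0:
            continue                      # violates t >= 0
        _, _, d1, d2 = example1_solution(t1, t2, t3, t4)
        t5 = d1 - d2
        if t5 < 0:
            continue                      # violates t5 >= 0
        if best is None or t5 > best[0]:
            best = (t5, (t1, t2, t3, t4))
    return best


best_gap, best_t = widest_gap()
print(best_gap, best_t)
```

The search returns one of the many optimal t (all have $t_1 = t_3 = 0$ and $t_2 + t_4 = 1$), which reflects the fact that $t_2$ is left undetermined by the linear program.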


Example 2: The two sets are not separated


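$n_1 = 2$, $n_2 = 2$, $n = 4$, $m = 2$

$$X_1 = \begin{pmatrix} 0 & 2 \\ 3 & 0 \end{pmatrix} \text{ marked with o}, \qquad X_2 = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \text{ marked with x}$$

It is required to solve

$$\begin{pmatrix} 0 & 2 & -1 & 0 \\ 3 & 0 & -1 & 0 \\ -2 & 0 & 0 & 1 \\ 0 & -3 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \\ t_5 \end{pmatrix} \ge 0$$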
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Solving  the  first  four  equations  gives 


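$$\begin{pmatrix} p_1 \\ p_2 \\ d_1 \\ d_2 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} -3 & 3 & 2 & -2 \\ -2 & 2 & 3 & -3 \\ -9 & 4 & 6 & -6 \\ -6 & 6 & 9 & -4 \end{pmatrix}\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \end{pmatrix}$$

so that

$$t_5 = -d_1 + d_2 = (3/5)(t_1 + t_3) + (2/5)(t_2 + t_4)$$

The constraint $t_5 \ge 0$ is implied by the constraints $t_i \ge 0$, i = 1, ..., 4.
The constraint

$$1 = t_1 + t_2 + t_3 + t_4$$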
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implies  that 


$$t_5 = (3/5)[1 - (t_2 + t_4)] + (2/5)(t_2 + t_4) = 3/5 - (1/5)(t_2 + t_4)$$

Since $d_1 - d_2$, which is $-t_5$, is to be maximized, $t_2 + t_4$ is to be
maximized. Since the t's are non-negative, the optimum choice is

$$t_1 = t_3 = 0, \qquad t_2 + t_4 = 1$$

This gives

$$p_1 = (3/5)\,t_2 - (2/5)\,t_4 = t_2 - 2/5$$

$$p_2 = (2/5)\,t_2 - (3/5)\,t_4 = t_2 - 3/5$$

$$d_1 = (4/5)\,t_2 - (6/5)\,t_4 = 2t_2 - 6/5$$

$$d_2 = (6/5)\,t_2 - (4/5)\,t_4 = 2t_2 - 4/5$$

and

$$d_1 - d_2 = -2/5$$

with $t_2$ undetermined in the range $0 \le t_2 \le 1$. The following
shows the situation for $t_2 = 0$, 1/2, 1.

[Figure: the pair of bounding planes for $t_2 = 0$, 1/2 and 1; in each case $d_1 - d_2 = -2/5$.]

The shaded region is the overlap region available to both sets.
Consider now the inequality constraint

$$1 \le t_1 + t_2 + t_3 + t_4$$

The objective

$$-t_5 = -(1/5)(t_1 + t_3) - (2/5)(t_1 + t_2 + t_3 + t_4)$$

is maximized by choosing $t_1 + t_3$ small, as well as $t_1 + t_2 + t_3 + t_4$
small; these are not in conflict. Thus $t_1 = t_3 = 0$ and $t_2 + t_4 = 1$ as
was found before.
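The one-parameter family of bands found for this example can be checked directly against the data. This is a minimal sketch, assuming the closed forms $p = (t_2 - 2/5,\; t_2 - 3/5)$, $d_1 = 2t_2 - 6/5$, $d_2 = 2t_2 - 4/5$ that follow from the solution matrix with $t_1 = t_3 = 0$ and $t_4 = 1 - t_2$; the function names are illustrative.

```python
from fractions import Fraction as F

X1 = [(0, 2), (3, 0)]   # points marked with o
X2 = [(2, 0), (0, 3)]   # points marked with x


def band_for(t2):
    """Return (p, d1, d2) of the optimal band for a given t2 in [0, 1]."""
    t2 = F(t2)
    p = (t2 - F(2, 5), t2 - F(3, 5))
    return p, 2 * t2 - F(6, 5), 2 * t2 - F(4, 5)


def dot(x, p):
    return sum(F(a) * b for a, b in zip(x, p))


def is_valid_band(p, d1, d2):
    """Check X1 p >= d1 and X2 p <= d2 for every data point."""
    return (all(dot(x, p) >= d1 for x in X1)
            and all(dot(x, p) <= d2 for x in X2))


for t2 in (0, F(1, 2), 1):
    p, d1, d2 = band_for(t2)
    print(t2, p, d1, d2, is_valid_band(p, d1, d2), d1 - d2)
```

Every member of the family has the same width measure $d_1 - d_2 = -2/5$; only the orientation of the band changes with $t_2$.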


A cluster might be defined as a set of points such that between any two
points $x^A$, $x^B$ in the cluster there is a sequence of points in the
cluster: $x^A, x^1, x^2, \ldots, x^I, x^B$ which are close to one another
in the sense

$$\|x^A - x^1\| \le d$$

$$\|x^i - x^{i+1}\| \le d, \qquad 1 \le i \le I-1$$

$$\|x^I - x^B\| \le d$$

for some distance d. Here $\|x\|$ means Euclidean length, that is
$\|x\| = (\sum_k x_k^2)^{1/2}$ where the $x_k$ are the components of x.
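Under this first definition a cluster is a connected component of the graph that joins any two points no more than d apart. A minimal sketch follows; the function name `clusters_at_threshold` and the sample points are illustrative, and `math.dist` requires Python 3.8 or later.

```python
import math


def clusters_at_threshold(points, d):
    """Group point indices into clusters chained by steps of length <= d."""
    n = len(points)
    label = [None] * n
    clusters = []
    for start in range(n):
        if label[start] is not None:
            continue
        # Grow a new cluster outward from an unassigned point.
        label[start] = len(clusters)
        members = [start]
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if label[j] is None and math.dist(points[i], points[j]) <= d:
                    label[j] = len(clusters)
                    members.append(j)
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
print(clusters_at_threshold(pts, 1.5))
print(clusters_at_threshold(pts, 8))
```

The two calls give different groupings for the same data, which is the dependence on d that motivates the more intrinsic definition developed next.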

Which points would be included in the cluster would clearly be dependent
on the value of d. For some situations, in which a reasonable value of
d may not be clear, it is therefore desirable to define a cluster
differently, in a more intrinsic, topological manner, independent of the
choice of d. Such a definition has a special significance even when
suitable d values may be estimated, since the clusters thus defined have
an  absolute,  intrinsic  rr 

The new definition will involve use of two metrics, the familiar
Euclidean metric

$$\|x^A - x^B\| = \left[\sum_{k=1}^{n} (x_k^A - x_k^B)^2\right]^{1/2}$$

and a new metric which will be called a Stepping Stone metric.

Between points $x^A$ and $x^B$, there are many sequences

$$x^A = x^0,\; x^1,\; \ldots,\; x^{I+1} = x^B$$

of points. For each sequence, the maximum distance between successive
points is

$$d_s = \max_{0 \le i \le I} \|x^i - x^{i+1}\|$$

where I depends on the sequence s. The quantity $d_s$ will be called the
Stepping Stone distance along the sequence s from $x^A$ to $x^B$. For the
set S of all such sequences, choose the least of the $d_s$ values

$$d = \min_{s \in S} d_s$$

Thus, d is the largest step that one is obligated to take in going from
$x^A$ to $x^B$. The quantity d will be called simply the Stepping Stone
distance between points $x^A$ and $x^B$.

This distance satisfies the three properties of a metric since the
Stepping Stone distance is always non-negative, is 0 if $x^B = x^A$, is
the same from $x^A$ to $x^B$ as from $x^B$ to $x^A$, and it satisfies the
triangle inequality:

$$d(x^A, x^C) \le d(x^A, x^B) + d(x^B, x^C)$$

since the path from $x^A$ to $x^B$ to $x^C$, having Stepping Stone distances
$d(x^A, x^B)$ and $d(x^B, x^C)$, respectively, is one of the paths
considered for $d(x^A, x^C)$, and for this particular path the Stepping
Stone distance is

$$\max\,(d(x^A, x^B),\; d(x^B, x^C))$$

which is less than or equal to the sum.
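Written out as a chain, the argument gives slightly more than the triangle inequality. The following restatement uses only the quantities defined above; the observation that the middle bound is the stronger, "ultrametric" form is an added remark, not a claim made in the original text.

```latex
% The concatenated path A -> B -> C is one admissible sequence for d(x^A, x^C),
% and its largest step is the larger of the two partial Stepping Stone distances.
\[
  d(x^A, x^C)
  \;\le\; \max\bigl\{\, d(x^A, x^B),\; d(x^B, x^C) \,\bigr\}
  \;\le\; d(x^A, x^B) + d(x^B, x^C)
\]
```

The second inequality holds because both distances are non-negative.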

The Stepping Stone distance between $x^A$ and $x^B$ is thus the Euclidean
distance between two points, say $x^I$, $x^{I+1}$, in a certain sequence
which goes from $x^A$ to $x^B$. The point $x^I$ is defined as the frontier
point for going from $x^A$ to $x^B$ and $x^{I+1}$ is defined as the frontier
point for going from $x^B$ to $x^A$.

A cluster is defined as a set of points which includes all their
frontier points and cannot be decomposed into smaller clusters. A point
which is not a frontier point in going from one cluster to another is
called an interior point.

The computation of clusters according to these concepts can best be
carried out in terms of two m x m matrices where m is the number of data
points. The i,j entry for one matrix is the Stepping Stone distance
from point i to point j. In another matrix, the i,j entry is the
frontier point number for going from point i to point j.

The matrix of Stepping Stone distances is initialized to the Euclidean
distance between points. The matrix is continually swept, replacing at
each step the i,j entry by the maximum of the i,k and k,j entries if
this produces a smaller i,j entry. Thus, if (the biggest step required
in going from point i to point k) and (the biggest step required in
going from point k to point j) are less than (the biggest step found
thus far to be required in going from i to j), then the maximum of the
former replaces the latter. The sweeping continues until no more change
occurs.
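The sweep described above can be sketched as follows. It is a minimax-path variant of the familiar all-pairs update; the repeat-until-no-change loop mirrors the text, the frontier-point bookkeeping is omitted, and the function name `stepping_stone_distances` is illustrative (`math.dist` requires Python 3.8 or later).

```python
import math


def stepping_stone_distances(points):
    """Return the matrix of Stepping Stone distances between all pairs of points."""
    n = len(points)
    # Initialize with the Euclidean distance between points.
    dist = [[math.dist(a, b) for b in points] for a in points]
    changed = True
    while changed:                      # sweep until no entry changes
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    # Biggest step needed when going from i to j through k.
                    via_k = max(dist[i][k], dist[k][j])
                    if via_k < dist[i][j]:
                        dist[i][j] = via_k
                        changed = True
    return dist


pts = [(0, 0), (1, 0), (3, 0), (10, 0)]
ss = stepping_stone_distances(pts)
for row in ss:
    print(row)
```

For these four collinear points the direct distance from the first to the third point is 3, but the largest step actually required is 2, taken between the second and third points.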

The matrix of frontier points is initialized so that the i,j entry is i
if below the diagonal and is j if above the diagonal, the diagonal
entries being set to 0 and subsequently ignored. If the ik Stepping
Stone distance replaces the ij Stepping Stone distance, then the ik
frontier entry replaces the ij entry and the ki entry replaces the ji
entry. Conversely, if the kj Stepping Stone entry replaces the ij
entry, then the kj frontier entry replaces the ij entry and the jk entry
replaces the ji entry.

Membership in the first cluster is determined by picking a point
arbitrarily, then including it and all its frontier points in the
cluster, then all their frontier points, etc., and finally all interior
points which have frontier points in the cluster. Membership in the
next cluster is determined in the same way, beginning with any point not
already assigned to a cluster.


Certain refinements may be mentioned but will not be discussed in
detail. These include: the case in which the ik Stepping Stone
distance equals the kj, and is less than the ij Stepping Stone distance,
or the case in which both equal the ij Stepping Stone distance; the case
in which subclusters exist; the case in which bonds are defined to force
two points and their clusters to be combined into the same cluster.


A  special  point  of  interest  is  that  this  approach  to  clusters  does  not 
require  a  prior  definition  of  smallness  by  the  user,  which  can  often  be 
a  difficult  requirement.  Instead,  the  definition  of  cluster  here  is 
more  intrinsic. 


Another interesting feature of this approach is that two metrics are
being simultaneously employed. And the pivotally important frontier
points have the property that the two metrics coincide on pairs of
frontier points, which are thus intrinsically distinguished points.


In establishing certain properties of clusters generated in this way,
some notation will be useful. Let $a_1, a_2, \ldots$ be interior points
and $\alpha_1, \alpha_2, \ldots$ frontier points of a cluster A; and $b_1, b_2, \ldots$
and $\beta_1, \beta_2, \ldots$ the interior and frontier points of a different
cluster B; etc. The Stepping Stone metric will be assumed in speaking
of the distance between points, or the closeness of points.


Frontier points are defined in pairs; one frontier point is said to be
the pair of the other. The points of a cluster for which a point $\alpha$ is a
frontier point are said to belong to $\alpha$. Also, the set of points which
are frontier points for an interior point a are said to belong to a.

The operator T applied to a set of points produces all the points which
are frontier points for at least one of the interior points.

$$T(a) = [\alpha_1, \alpha_2, \ldots]$$

The inverse operator $T^{-1}$ applies to frontier points $\alpha_1, \alpha_2, \ldots$
and produces all points for which these are frontier points.

$$T^{-1}(\alpha) = [a]$$


Note that a frontier point is a frontier point for itself. Also, a
frontier point $\alpha_1$ in going from cluster A to cluster B may use $\alpha_2$ as
its frontier in going from cluster A to cluster C.

Points belonging to a frontier point $\alpha$ are no further from one another
than $\alpha$ is from its pair. This follows since otherwise one such point
would be further from $\alpha$, and from its pair, than is $\alpha$.


We can show that the frontier point for a cluster which is most
distant from its pair is a frontier point for all points in its
cluster. Thus, the points of a cluster are no further from one another
than this frontier point is from its pair. Hence, there is at least one
frontier point which can serve the entire cluster. Such a frontier
point will be called a capital frontier point.


To prove this result, suppose the contrary, that $\alpha_1$ cannot serve the
entire cluster. Suppose that the frontier point in question, say $\alpha_1$,
is a distance $d_1$ from its pair $\beta_1$, and that the points $T^{-1}(\alpha_1)$
which belong to it do not exhaust the cluster but that there is another
frontier point $\alpha_2$ whose pair $\beta_2$ is in the same cluster B as $\beta_1$, and
that $T^{-1}(\alpha_2)$ has no points in common with $T^{-1}(\alpha_1)$. Suppose that
the distance from $\alpha_2$ to $\beta_2$ is $d_2$, and that d is the least distance
between points of $T^{-1}(\alpha_1)$ and points of $T^{-1}(\alpha_2)$. By hypothesis

$$d_2 \le d_1$$

If $d \le d_1$ and $d_2 < d_1$, then there are points of $T^{-1}(\alpha_1)$ which
have a shorter path to B by going through $\alpha_2$, which is contrary to
assumption. If $d \le d_1$ and $d_2 = d_1$, then $\alpha_1$ can serve as
frontier for both $T^{-1}(\alpha_1)$ and $T^{-1}(\alpha_2)$, which is also contrary to
assumption. Hence, d must exceed $d_1$.

But since any frontier point for points of $T^{-1}(\alpha_1)$ is no more than
$d_1$ from its pair, no points belonging to them are more than $d_1$ apart
and so there can be no point belonging to them which is in $T^{-1}(\alpha_2)$.

Since the points of a cluster are generated by T or $T^{-1}$ operations, no
such operation on a point generated from $\alpha_1$ can ever produce a point of
$T^{-1}(\alpha_2)$. Thus, the assumption that $\alpha_1$ and $\alpha_2$ are in the same
cluster is untenable. Thus $\alpha_1$ can serve the entire cluster; any other
frontier point $\alpha_2$ with its pair in B is thus redundant. The other
assertions follow directly.

It follows further that if any point $a$ of the cluster is chosen, the
operation $T(a)$ will produce a capital frontier point and so
$T^{-1}T(a)$ will produce the entire cluster. Were this conclusion not
established, the questions could be raised whether the membership in a
cluster would be independent of which point was selected first as
generator, and also whether one sequence of $T$, $T^{-1}$ operations
would generate a different cluster than another sequence.

Clusters thus have the intuitive property that the distance within the
cluster is less than the distance between clusters, where the latter
distance means the distance to the furthest cluster.

Clusters  have  several  different  uses  in  the  analysis  of  large  sets  of 
data. 

One use is to discover types of points. The points of one type will
differ in each variable by a large amount from the points of the other
type, the quantity being considerably more than the differences within
each type. This usage is sometimes called taxonomic since it discovers
natural groupings. It has been used in this way to discover species or
subspecies of animals, and to discover different types of voter or
consumer requiring different persuasions. This usage of clusters is the
one most often addressed in the literature.


In the above use, clusters would be sought with respect to behavior or
performance or response variables. Once this was done, corresponding
clusters might be sought among descriptive or predictive or causal
variables. The underlying idea would be that the former are effects
while the latter are causes. Dichotomous differences in effects suggest
dichotomous differences in causes. This is the case if the cause-effect
relation is continuous. But it need not always be the case since some
causal factors can produce a discontinuous effect relative to some
threshold value, so that a small change in the causal variable can
produce a discontinuous jump in a response.

The first sort of usage does not apply to the flutter prediction
problem. Here the types of behavior are the different kinds of
stability condition, which are known beforehand for the Annular Cascade
Data Base. What is of interest, therefore, are the corresponding
clusters in the space of predictive variables. But here it is well
known that there are transition regions where a small change in these
variables can make a large qualitative change in the stability
condition. Furthermore, the subset of the Annular Cascade Data Base
which was used, in the interest of computer economy, consists
principally of the points in these transition regions. Thus the points
selected form clusters which intentionally contain points of different
stability conditions.

It might be mentioned parenthetically that an example in jet engine
technology of the fruitful use of these two types of cluster application
might arise in analysis of repair/operational problems of jet engines.
Do these separate into clusters? Into syndromes? If so, these clusters
would be the behavioral types. Then we would ask whether there were
corresponding clusters in the space of engine history/operation. These
would be the causal clusters. This correspondence would suggest that an
engine in a certain operational cluster will develop a certain syndrome
of performance problems.

The cluster concept also has a third, quite different use, which is the
use of greater importance for the empirical flutter prediction problem.


Suppose  there  is  no  big  separation  between  the  predictive  variables  that 
correspond  to  one  type  of  behavior  and  the  predictive  variables  that 
correspond  to  a  different  type  of  behavior.  If  the  predictive  variables 
are  sufficient  to  predict  behavior,  there  must  be  a  surface  running 
through  them  such  that  points  on  one  side  produce  one  type  of  behavior, 
while  points  on  the  other  side  produce  a  different  type  of  behavior. 

Discovery of such a surface is the task of discriminant analysis, which
is easier to the extent that the surface is not tightly crowded by
points on each side. If it is crowded, then a nonlinear surface may be
needed, and it will be necessary to identify where the surface is
tightly crowded, that is, where the surface must thread through a
cluster having points of two or more different behavioral types.

Thus,  the  third  use  of  cluster  analysis  is  to  locate  clusters  in  the 
space  of  predictive  values  for  points  of  two  or  more  behavioral  types. 

The task of proposing a nonlinear surface separating the two behavioral
types is thereby simplified since the clusters can initially be viewed
as single "points." The nonlinear surface should traverse the "points"
of mixed behavioral type and separate those of single behavioral type.

Indeed,  clusters  of  a  single  behavioral  type  may  very  possibly  be 
irrelevant  to  the  problem  of  finding  a  discriminating  plane,  provided 
there  are  enough  clusters  of  mixed  stability  type  to  define  the  planes. 

There is still a fourth use of clusters. This concerns the geometric
shape of the cluster.

Consider the center of gravity of a cluster, that is, the point whose
coordinates are the average of the corresponding coordinates of the
members of the cluster.

If the center of gravity is inside the cluster, this suggests the
cluster is convex; if outside the cluster, concavity is indicated.


The same question can, of course, be asked relative to convex hulls of
the cluster. And the answers may be different. For example, if the
points of the cluster were distributed over a half circle, the answers
would be different; while if they were distributed over a circle, the
answers would be the same.

GALACTIC was applied to 891 test points selected from the much larger
Annular Cascade Data Base.


These  points  were  selected  because  they  were  in  the  transition  regions 
between  stability  regions.  This  selection  served  to  reduce  the  total 
number  of  points  required  and  so  to  minimize  computer  storage  and  running 
time.  At  the  same  time,  this  selection  excluded  most  "easy  points," 
making  it  difficult  to  find  planes  that  separated  any  two  stability 
regions  since  the  plane  was  constrained  to  lie  between  points  close  on 
either  side. 

While  the  Annular  Cascade  Data  Base  contains  a  great  deal  of  information 
about  each  test  point,  production  engine  test  points  would  have  much  less 
information.  Accordingly,  it  would  be  impractical  to  use  more 
information  from  the  Annular  Cascade  Data  Base  than  could  be  expected 
from  production  engine  test  points.  For  this  reason,  GALACTIC  used  only 
the  following  nine  items  of  aeromechanical  data  for  each  test  point: 

Name     Meaning

PIN      Pressure at inlet
TINLET   Temperature at inlet
SIGMA    Solidity
REY      Reynolds number
INC      Leading edge angle of incidence
MLE      Mach number at leading edge
VLE875   Velocity at a point on the leading edge distant from the
         blade root by 87.5% of the span
REDV1    Reduced velocity - flexure
REDV2    Reduced velocity - torsion


The  891  test  points  populated  14  stability  regions: 


Stability   Number
Code        of Points   Stability Region

00          209         Stall; stable
01          137         Stall; flexural flutter
02           56         Stall; torsional flutter
03           22         Stall; flexural and torsional flutter
04           15         Stall; 2nd flexural flutter
05            2         Stall; 1 flex., 1 tors., 2 flex. flutter
06            1         Stall; 1 flex., 2 flex. flutter
07           31         Stall; separated flow
10          201         Interior, stable
20           73         Choke; stable
21          126         Choke; flexural flutter
22           13         Choke; torsional flutter
23            4         Choke; flex. and torsional flutter
27            1         Choke; separated flow
            ---
            891

These may be combined as:

376   Flutter (9 regions)
483   Stable (3 regions)
 32   Separated Flow (2 regions)
---
891

or as

473   Stall (8 regions)
217   Choke (5 regions)
201   Interior (1 region)

The 14 stability regions produce

$$13 + 12 + \cdots + 1 = 91$$

pairs of stability regions and therefore require an equal number of
hyperplanes to separate them, or more correctly pairs of hyperplanes,
since the Relaxed Discriminant Method is used.
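The tallies and the pair count quoted above can be checked with a short script. The following is a minimal Python sketch, not part of GALACTIC or EFAGHY; the dictionary `COUNTS` and the helper names are illustrative, and the reading of the two code digits (first digit for stall, interior or choke; second digit for stable, flutter or separated flow) is an assumption inferred from the table of stability regions.

```python
from itertools import combinations

# Point counts per stability code, transcribed from the table of regions.
COUNTS = {
    "00": 209, "01": 137, "02": 56, "03": 22, "04": 15, "05": 2, "06": 1,
    "07": 31, "10": 201, "20": 73, "21": 126, "22": 13, "23": 4, "27": 1,
}


def group_by_zone(counts):
    """Sum points by first digit (assumed: 0 stall, 1 interior, 2 choke)."""
    names = {"0": "Stall", "1": "Interior", "2": "Choke"}
    totals = {}
    for code, n in counts.items():
        zone = names[code[0]]
        totals[zone] = totals.get(zone, 0) + n
    return totals


def group_by_behavior(counts):
    """Sum points by second digit (assumed: 0 stable, 7 separated, else flutter)."""
    totals = {"Stable": 0, "Flutter": 0, "Separated Flow": 0}
    for code, n in counts.items():
        if code[1] == "0":
            totals["Stable"] += n
        elif code[1] == "7":
            totals["Separated Flow"] += n
        else:
            totals["Flutter"] += n
    return totals


# Every unordered pair of stability regions needs its own hyperplane pair.
region_pairs = list(combinations(sorted(COUNTS), 2))
```

Under these assumptions the groupings reproduce the totals quoted in the text, and `len(region_pairs)` gives the number of hyperplane pairs required.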

The  91  hyperplanes  were  produced  over  a  23-day  period  and  required  31 
computer  runs  and  up  to  10  hours  VAX4  processor  time  per  run. 

Intersecting  Pairs  of  Stability  Regions 

For 21 of 91 pairs of stability regions, the Discriminant Feasibility
Method found that the regions intersect one another (i.e. the convex
hulls intersect). The Relaxed Discriminant Method proceeds to find
pairs of hyperplanes which in all these cases show an overlap, as
expected. The overlaps are generally not large; their widths may be
compared to one another and to unity since all variables had been
normalized to have a range from 0 to 1. The following provides
details on these 21 pairs of stability regions: the overlap width, the
number of points in each pair of stability regions, and the percentages
of these points which are in the overlap region or outside it on the
correct or on the incorrect side.


Stability   Stability   Overlap   %         %           %         Number
Code        Code        Width     Correct   Incorrect   Overlap   of Points

10          20          0.437      4        0.7         95        274
10          21          0.327      5        0.6         94        327
10          22          0.008     77        3           20        214

Several observations can now be made, based largely on the above
details.

o  The  interior  region  showed  the  most  intersections  whether  with  stall 
or  with  choke  regions. 

o  The  fewest  intersecting  pairs  occurred  between  the  choke  and  stall 
regions. 

o  Even  for  the  21  pairs  of  intersecting  stability  regions,  there  were 
relatively  few  test  points  on  the  incorrect  side  of  the  hyperplanes, 
and  the  pairs  with  the  most  incorrect  points  have  especially  narrow 
overlap  regions.  Thus,  the  incorrect  points  may  have  been  only 
slightly  in  error. 

o  Most  overlaps  have  narrow  width,  suggesting  that  a  slightly 
nonlinear  surface  might  serve  to  discriminate  between  the  two 
stability  regions. 

o  Substantial  numbers  of  points  fall  into  many  of  the  overlap  zones, 
making  the  hyperplanes  in  such  cases  ambiguous  and  of  little  help  in 
discriminating  between  the  stability  regions  they  were  to  separate. 


o  The Discriminant Feasibility Method is virtually foolproof in
declaring that two regions intersect since it constructs a common
point. The conclusion that two regions are disjoint, that is, that
a common point could not be found, is not as foolproof.

o  The Relaxed Discriminant Method is seen to work consistently with
the Discriminant Feasibility Method in that, in all 21 cases, it
produced hyperplanes with overlaps.

Disjoint  Pairs  of  Stability  Regions 

For 70 of 91 pairs of stability regions the Discriminant Feasibility
Method found that the regions are disjoint (i.e. the convex hulls do
not intersect). This conclusion is not, however, foolproof, since it
is based on the negative result that a point common to the two regions
could not be found.

However, in all but one case the conclusion was confirmed by the
Relaxed Discriminant Method finding discriminating pairs of
hyperplanes. Usually, the pairs of planes have a gap, as is expected.
But in some cases the iteration counter ran out on the gap case, and
the overlap case was tried and resulted in a slight overlap (say less
than 0.01), or perhaps a negative overlap. In a few cases, neither the
gap nor the overlap case was successfully concluded before the
iteration counter was exhausted and the latest hyperplane pair was
accepted with, again, no more than a slight overlap.

In three cases there was an important difference between the two
methods, namely for regions (00,20), (00,21) and (10,23). Here the
Discriminant Feasibility Method declared that the two regions are
disjoint, but the Relaxed Discriminant Method was unable to
successfully conclude either the gap or overlap case. When the
iteration counter ran out, the hyperplane pair had, respectively, an
overlap of 1.31 with 22 percent of the points in the wrong region, a
gap of 0.002 but 39 percent in the wrong region, and an overlap of
0.281 with 100 percent of the points in the overlap zone.


Two possible explanations are either that the Discriminant Feasibility
Method, failing to construct a point common to both regions, declared
disjointness prematurely, or that the Relaxed Discriminant Method
terminated prematurely. Which explanation is correct has not been
ascertained.


Conclusions

The above discussion accounts for all but a very few scattered
instances of points falling into overlap regions and, therefore, not
being discriminated by the hyperplane pairs, and these points may have
been borderline.

For  23  percent  of  the  pairs  of  stability  regions  (21  of  91),  the 
stability  regions  appear  to  intersect  and  to  require  a  nonlinear 
surface  to  fully  discriminate  between  the  regions. 

For  74  percent  of  the  pairs  of  stability  regions,  a  linear 
discriminating  surface,  that  is  a  hyperplane,  can  discriminate  between 
the  two  regions. 

For three cases, 3 percent, it is not clear to which category they
belong.

The voting procedure in EFAGHY can compensate to some extent for
ambiguous hyperplane pairs, that is, pairs with an appreciable overlap
zone, since even though the correct stability region may lack some
votes due to ambiguous hyperplane pairs, it may nonetheless receive
more votes than any other region.


3.0 


EFAGHY is an acronym for "Empirical Flutter Analysis by GALACTIC
Hyperplanes." This is a Fortran77 program which has been running on the
VAX4 computer and is designed to apply a file of hyperplanes produced by
GALACTIC to files of test points so as to determine in which stability
region each test point is located.

3.1 Hyperplane File from GALACTIC

The file, called FILEHO within EFAGHY, was produced by GALACTIC and has
the format, for each hyperplane:

Stability code for 1st stability region
Stability code for 2nd stability region
$c'$    Left hand side constant for 1st stability region
$c''$   Left hand side constant for 2nd stability region
$c_1$   Right hand side coefficient for 1st variable
$c_2$   Right hand side coefficient for 2nd variable

The pair of parallel hyperplanes have the equations

$$c' = c_1 x_1 + c_2 x_2 + \cdots$$

and

$$c'' = c_1 x_1 + c_2 x_2 + \cdots$$
For a given test point, the right hand side is computed for the values
$x_j$ which occur at the test point.

If $c'$ is less than or equal to the right-hand side, the first
stability region claims the point, while if $c''$ exceeds or equals
the right-hand side, the second stability region claims the point.

If $c'$ is less than $c''$ the regions overlap and both regions can
claim a point lying between the two planes.

If $c'$ exceeds $c''$ there is a gap between the planes so that a test
point which falls in the gap is not clearly claimed by either region.
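The claim rule just stated can be sketched in a few lines of Python. This is an illustration only, not the EFAGHY Fortran source; the function names `right_hand_side` and `claims` are hypothetical, and the sketch follows the inequalities exactly as given in this section (the first region claims a point when the first constant does not exceed the right-hand side, the second region when the second constant is not less than it).

```python
def right_hand_side(coefficients, words):
    """Dot product of the hyperplane coefficients with a test point record.

    Words that are not relevant simply carry a zero coefficient.
    """
    if len(coefficients) != len(words):
        raise ValueError("coefficient and record lengths differ")
    return sum(c * x for c, x in zip(coefficients, words))


def claims(c_first, c_second, rhs):
    """Return the set of regions ('first', 'second') that claim the point.

    Both names present means the point lies in an overlap zone;
    an empty set means it lies in a gap between the two planes.
    """
    claimed = set()
    if c_first <= rhs:      # first region: constant <= right-hand side
        claimed.add("first")
    if c_second >= rhs:     # second region: constant >= right-hand side
        claimed.add("second")
    return claimed
```

For example, with constants 0.25 and 0.75 a right-hand side of 0.5 is claimed by both regions (overlap), whereas with the constants swapped the same point is claimed by neither (gap).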

Note that there is a coefficient for each of the words in the record for
each test point. In the application of GALACTIC to the Annular Cascade
Data Base, there are 20 words for each data point. The same 20 word
format is used for test points employed by GALACTIC to build the
hyperplanes as well as test points to which EFAGHY applies the
hyperplanes. The words which are not relevant are multiplied by a zero
coefficient.


Test Point Files to which Hyperplanes are Applied


EFAGHY is presently dimensioned to accept up to 14 files of test
points. The number of words per test point record is also input, but
must be the same as was used in GALACTIC, in which 20 was used.


Input  to  EFAGHY 


In addition to the number and names of files of test points and the
length of their records and the name of the hyperplane file, EFAGHY asks
the name for a file on which to write its output. If the response is
null, then the output is printed on the user's terminal device.


Other input allows the input data points to be perturbed a fractional
amount both plus and minus in any variable. The amount may be different
for each variable.


It is also possible to confine the application of EFAGHY to certain
hyperplanes, which must be named by the number in sequence in which they
appear on the hyperplane file.


Finally, the output volume is controlled by a 7-digit number IOUTPUT
consisting of 0's and 1's. The 0 calls for suppression of an output,
and 1 calls for display of output. The various possible outputs can best
be explained in the course of describing the program capability.

Voting by Hyperplanes

As described earlier there are 14 possible stability regions considered
currently by GALACTIC. Between each pair of regions GALACTIC seeks to
put a pair of parallel hyperplanes. As the number of pairs among 14
things is 91, there are 91 pairs of hyperplanes. Each pair is intended
to separate one stability region from a second stability region.

For a given test point and a given pair of hyperplanes, the test point
will be situated in one of the three domains into which the pair of
hyperplanes divides all of space.

If the point is not situated between the pair of hyperplanes, the point
occupies the same half space which contains one of the two stability
regions. Thus this pair of hyperplanes indicates that the point has met
a necessary (not sufficient) condition for being in one stability region
and not being in the other. In this case, the pair of hyperplanes
applied to the test point casts a yes vote for one stability region and
a no vote for the other.


If, on the other hand, the test point is situated between the pair of
hyperplanes, the situation is ambiguous. In case of overlap, it is
possible to cast a vote for both, that is, to cast a fractional vote for
each region depending on how deep into each region the point is, being 1
or yes for a stability region if on the plane bounding the half space
belonging to that region and -1 or no if on the other plane. Thus the
vote for each region varies linearly from +1 to -1.

In the case of a gap, the situation is also ambiguous. As with overlap,
it is possible to cast a fractional vote for each stability region. If
the point is virtually at one plane, a +1 vote is cast for the
stability region it bounds and -1 for the other region. The votes can
vary linearly across the gap.

Now each stability region is voted on by 13 hyperplanes. Thus it can
get a maximum of 13 yes votes. Since for each such vote, a different
one of the other 13 stability regions receives a no vote, it follows
that at most one stability region can receive 13 yes votes.

Other stability regions will receive some yes votes. But these should
not be regarded as incorrect votes, since a yes vote simply means that
the test point satisfies one of the 13 necessary conditions required of
points in a particular stability region. A given test point will
partially fulfill the requirements for several different stability
regions, but it can fulfill all conditions for only one stability
region.

Ideally it would be sufficient simply to count the yes votes to
determine into which stability region the hyperplanes classified a
point. Were there no ambiguous regions between the planes, there would
be 91 yes and 91 no votes distributed among 14 stability regions, so
that the average number of yes votes would be 6.5 per stability region.
Because of the occurrence of ambiguous overlap or gap regions, there can
be fewer yes votes. "Y" vote is the term that will be used for the
simple procedure of assigning a point to the stability region which has
received the most yes votes, that is, not counting no votes, overlap or
gap votes.

A  somewhat  more  complicated  voting  procedure  is  to  give  each  stability 
region  credit  for  any  overlap  or  gap  region  adjacent  to  a  half  space 
belonging  to  the  stability  region.  No  votes  are  subtracted.  This 
combined  vote  will  be  termed  a  "C"  vote. 

Another  way  of  combining  votes  is  the  weighted  vote,  or  "W"  vote.  This 
adds  the  yes  votes,  subtracts  the  no  votes  and  treats  the  overlap  or  gap 
cases  as  follows.  For  these  cases,  the  adjacent  stability  region  gets  a 
1  vote  if  the  test  point  is  on  the  plane  which  bounds  it,  but  the  vote 
declines  linearly  as  the  test  point  is  further  across  the  overlap 
region,  reaching  0  on  the  far  plane. 

A fourth way is the balanced vote, or "B" vote. This is similar to the
W vote except that the vote goes from +1 to -1 in the overlap or gap
cases rather than from +1 to 0.
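The Y, C, W and B counting rules can be compared on a single hyperplane pair. The sketch below is one possible reading, not the EFAGHY implementation; `pair_votes` is a hypothetical name. It assumes that the first region lies on the high side of the right-hand side (as in the claim rule of Section 3.1), that a region receives full credit at the plane bounding the half space that belongs to it alone, and that the fractional vote varies linearly between the two planes for overlaps and gaps alike.

```python
def pair_votes(c_first, c_second, rhs):
    """Vote contributions of one hyperplane pair for one test point.

    Returns {'first': {...}, 'second': {...}}, each holding the
    Y, C, W and B contributions for that stability region.
    """
    lo, hi = min(c_first, c_second), max(c_first, c_second)
    if rhs >= hi:
        sure = "first"          # beyond both planes on the first region's side
    elif rhs <= lo:
        sure = "second"         # beyond both planes on the second region's side
    else:
        sure = None             # between the planes: overlap or gap

    if sure is not None:
        other = "second" if sure == "first" else "first"
        return {
            sure: {"Y": 1, "C": 1, "W": 1.0, "B": 1.0},
            other: {"Y": 0, "C": 0, "W": -1.0, "B": -1.0},
        }

    # Fraction of the way from the low plane to the high plane.
    t = (rhs - lo) / (hi - lo)
    return {
        # Y ignores ambiguous votes; C credits both adjacent regions.
        "first": {"Y": 0, "C": 1, "W": t, "B": 2 * t - 1},
        "second": {"Y": 0, "C": 1, "W": 1 - t, "B": 1 - 2 * t},
    }
```

Summing these contributions over the 13 pairs in which a region takes part gives its Y, C, W and B totals. A point midway between the planes contributes 0.5 to each region's W total and nothing to either B total.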

A  fifth,  and  final  way  is  the  probabilistic  or  P  vote.  This  is  like  the 
B  vote  except  that  each  vote  is  multiplied  by  the  probability  of  an 
accurate  vote  for  the  stability  region  and  by  the  hyperplane.  This  is 
gotten  by  first  applying  all  hyperplanes  to  all  data  base  points  used  to 
generate  the  hyperplanes  and  counting  how  many  points  which  were  in  the 
stability  region  were  correctly  identified.  Since  in  some  cases  there 
were  very  few  points  in  a  stability  region,  this  observed  success  ratio 
was  replaced  by  the  lower  50  percent  confidence  bound  for  a  binomial 
distribution.  If  there  are  many  points  the  confidence  band  is  narrow, 
while  if  there  are  few  points  it  is  broad.  The  final  vote  for  a 
stability  region  is  then  normalized  by  dividing  by  the  sum  of  the 
probabilities  used  in  its  calculation. 
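The report does not say how the lower 50 percent confidence bound was computed. One common construction, assumed here purely for illustration, is the exact one-sided (Clopper-Pearson) bound: the success probability at which the chance of observing at least the recorded number of successes equals one minus the confidence level. The function names below are hypothetical.

```python
from math import comb


def upper_tail(n, k, p):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def lower_confidence_bound(successes, trials, confidence=0.5, tol=1e-10):
    """Exact one-sided lower bound on a binomial success probability.

    Solves upper_tail(trials, successes, p) == 1 - confidence for p
    by bisection; the tail probability increases with p.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    if successes == 0:
        return 0.0
    alpha = 1.0 - confidence
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2
        if upper_tail(trials, successes, mid) < alpha:
            low = mid       # p too small to make the observation this likely
        else:
            high = mid
    return (low + high) / 2
```

With this construction a region correctly identified 9 times out of 10 receives a noticeably lower probability than one identified 90 times out of 100, which matches the stated intent of discounting success ratios based on few points.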


The five different ways of counting votes were in fact developed in the
order presented above, by evaluation against the test points from the
Annular Cascade Data Base which had been used in producing the
hyperplanes.

This information is available from EFAGHY in considerable detail. If the
7th digit of the output control IOUTPUT is 1 rather than 0, then the user
is asked to enter up to 20 test points for the following output, the test
points being numbered by the order in which they are encountered by
EFAGHY. The output has on the left margin the hyperplane number (1
through 91) and the stability code for the first stability region, while
on the right margin is the stability code for the second stability
region. Between these are 5 columns. In the second and fourth columns
are the lesser and the greater of the constants for the two stability
regions. Bracketing these two constants are parentheses, like ( ), in
case the stability regions overlap, or reversed parentheses, like ) (,
in case there is a gap between the two stability regions. The actual
right-hand side value for the test point in question and the hyperplane
is shown in the first, third or fifth column depending on whether it is
less than the two constants, lies between them, or exceeds them. If
the right-hand side is in the left, or first, column the hyperplanes vote
that the point is in the stability region listed on the left margin;
conversely for the right.

This highly detailed output allows the user to examine how each individual
vote was cast. It allows us to examine why a test point known to be in a
certain stability region is not getting the votes it deserves, and why
some other stability region is getting more votes than it deserves.

Perhaps one or another hyperplane is repeatedly casting wrong votes. Such
a hyperplane may need to be recalculated using nonlinear combination
variables.


The success for each hyperplane is summarized in a table entitled
"Discrimination Success by Hyperplane," which is printed if the sixth
digit of IOUTPUT is 1. This shows for each hyperplane and for both its
stability regions the number of yes, no, overlap and gap votes for each
point known to be in either region.


If the third digit of the output control IOUTPUT is 1 rather than 0 then a
table is printed out for each test point to which the hyperplanes are
applied. The table has a row for each stability region and shows the
number of yes, no, overlap and gap votes, the greatest fractional
penetration into a gap or overlap region, as well as the count of C votes,
W votes, B votes and P votes. An asterisk marks the maximum vote of each
of the five types.

Summary information for all files of test points is provided if the 4th
digit of IOUTPUT is 1. First there is a separate table entitled "Placement
of Identified Sets," for each file, showing for each of the five types of
votes how many times the correct stability region, when it was known, was
in first place without ties, in first place with ties, in 2nd place, or in
3rd place. Another table entitled "Discrimination Success by Set, Using B
Votes" shows for each stability region the number of test points for which
the stability region is known and there were 0, 1, 2, ... stability
regions with more B votes. Another table presents the same information but
for P votes. This information is summarized still further in a tally of
the number of test points for which the P and B votes agree in giving the
most votes to the correct stability region, for which one does but not the
other, etc.


Application of the hyperplanes to the test points from the Annular Cascade
Data Base from which they were constructed revealed that the P vote and
the B vote were more successful than the other three voting procedures in
identifying correctly the stability region in which the test points were
located.

When both the B vote and the P vote identified the same stability region
there was a still higher success rate.

Thus it seemed that in case the stability region was not known, the
stability region should be chosen for which both the B vote and P vote
agreed. But if they did not agree, what was to be done? Various
strategies were evaluated. Here B1 means the set with most B votes,
B2 the set with second most B votes. Similarly for P.

Choose set with B1 = P1, else no choice

Choose set with B1

Choose set with P1

Choose set with B1 = P1, else with B2 = P2, else no choice

Choose set with B1 = P1, else B2 = P2, else B2 = P1, else
B1 = P2, else no choice

Choose set with B1 = P1, else B2 = P1, else B1 = P2, else
B2 = P2, else P1

Choose set with B1 = P1, else B1 = P2, else B2 = P1, else
B2 = P2, else P1

Choose set with B1 = P1, else B1 = P2, else no choice

Choose set with B1 = P1, else B1 = P2, else B1.
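Each of the strategies listed above is an ordered series of agreement checks between the B and P rankings, followed by an optional fallback. A minimal Python sketch of that pattern follows; `choose_region` and its argument layout are hypothetical and are not taken from EFAGHY.

```python
def choose_region(b_ranked, p_ranked, checks, fallback=None):
    """Pick a stability region from the B and P vote rankings.

    b_ranked, p_ranked: stability codes ordered from most to fewest votes.
    checks: sequence of (b_rank, p_rank) pairs, 1-based, tried in order;
            (1, 2) means "B1 = P2".
    fallback: 'B1', 'P1', or None for "no choice".
    """
    for b_rank, p_rank in checks:
        candidate = b_ranked[b_rank - 1]
        if candidate == p_ranked[p_rank - 1]:
            return candidate
    if fallback == "B1":
        return b_ranked[0]
    if fallback == "P1":
        return p_ranked[0]
    return None
```

For instance, the strategy "B1 = P1, else B2 = P2, else no choice" corresponds to `checks=[(1, 1), (2, 2)]` with no fallback, and "B1 = P1, else B1 = P2, else B1" to `checks=[(1, 1), (1, 2)]` with `fallback="B1"`.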

For each strategy is supplied a table showing the number of test points
for each chosen stability region and for each actual stability region.
For perfect success, all off-diagonal entries would be zero. In addition,
the probability of success is shown for each chosen stability region, and
for all chosen stability regions. This is computed from the observed
success rate converted by binomial theory to a 50 percent lower confidence
level. The latter recognizes the uncertainty arising from small samples.
This probability information by stability region is intended to detect
whether some stability regions, when chosen by a certain strategy, deserve
more confidence than others.

This  information  is  presented  if  the  sixth  digit  of  IOUTPUT  is  1. 


As mentioned earlier, one of the input quantities is a seven-digit word,
IOUTPUT, whose digits are 0 or 1, where 0 indicates suppression of output
and 1 indicates display of output.

As a memory aid, EFAGHY reminds the user of the function of each digit by
use of a three-letter code, for which the details have been supplied in
the foregoing text.

The  seven  codes  and  their  abbreviated  meaning  are  as  follows: 


Equations of the hyperplanes

Record (20 words) for each data point

Y, C, N, B, P votes for each stability region for each test point

Assessment of voting success per stability region

Assessment of voting success per point

Assessment of voting success per combined vote strategy

Voting per hyperplane for selected points
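The digit-per-output scheme of IOUTPUT can be sketched as follows. This is an illustration only: it assumes the digits are numbered from the left and that the word is handled as a string so that leading zeros survive. The function name is hypothetical and the three-letter codes are not reproduced.

```python
from typing import List


def decode_ioutput(ioutput: str) -> List[bool]:
    """Return seven display flags, one per digit, taken left to right."""
    if len(ioutput) != 7 or any(digit not in "01" for digit in ioutput):
        raise ValueError("IOUTPUT must consist of exactly seven digits, each 0 or 1")
    return [digit == "1" for digit in ioutput]


flags = decode_ioutput("0000010")
# The sixth digit controls the assessment per combined vote strategy.
show_strategy_assessment = flags[5]
```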




4.0 


EFAGHY was first applied to the test points from the Annular Cascade Data
Base which were used by GALACTIC to generate the 91 hyperplanes. This
was done because the stability region to which each of these points
belonged was known. This would serve to calibrate the accuracy of
EFAGHY in three respects before it was applied to the Validation
Points.

First, it would show to what extent the voting by several hyperplanes
could compensate for the fact that some hyperplanes were quite
unsuccessful in discriminating between their two stability regions,
classifying most points in those regions into the ambiguous
overlap category. (As noted elsewhere, this deficiency in some
hyperplanes may perhaps be remedied by inclusion of nonlinear combination
variables.)

Second, it would allow for the evaluation of the different ways of
counting votes and of the mixed voting strategies.

Third, the experience with the Data Base would provide a standard of
success against which to appraise the application of the hyperplanes to
the validation points, whose stability region was not disclosed
beforehand.


4.1  Detailed  Examination  of  26  Points  from  File  55777B 

This  was  the  first  file  of  test  points  from  the  Annular  Cascade  Data 
Base,  for  which  the  stability  region  was  known,  and  to  which  the 
hyperplanes  were  applied  by  EFAGHY.  This  file  was  chosen  simply  because 
it was the smallest.

Using the Y, C, W votes initially and later the B and P votes, we found
that about 77 percent of these test points were assigned by the hyperplanes
to the stability region to which GALACTIC had been told they belonged.

These points included 12 stable points, 2 stall flex flutter points and 6
choke flex flutter points.

Upon examination of the aeromechanical data for each of the remaining six
points, we found that the stability region assigned by EFAGHY was actually
even more correct in a practical sense. In the case of 3 test points, the
listed stability condition coexisted with the stability condition selected
by EFAGHY and was given the second highest number of votes.

In the case of the other 3 test points, the stability region with the most
votes prevailed very near to the test point. In two of these cases the listed
stability condition had the second highest number of votes, while in one
case it received 9.2 votes, the fourth highest number, with the regions
receiving more votes (12.2, 11.4 and 11.0) all prevailing near the test point.

Thus from the point of view of practical use it may be said that 100
percent rather than 77 percent of the assignments were correct.

It may however be asked how it can happen that the hyperplanes do not
score perfectly on reproducing the listed stability conditions, since this
information was available to GALACTIC when it was constructing the
hyperplanes.

One reason is that for some pairs of stability regions, it is not possible
to separate one region from the other by use of a (multi-dimensional)
plane. In such cases, it may be that separation would be possible using
nonlinear combinations of the given input variables. It may of course be
that other variables beyond those currently used are needed.


In such cases GALACTIC will necessarily produce discriminating planes with
an overlap region between them, containing test points which the
hyperplanes cannot assign to either region. For about 21 hyperplanes, most
of the test points which they are intended to separate fall into the
ambiguous overlap region. Such planes may also have poor reliability with
respect to orientation.
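The way a pair of discriminating planes sorts a point can be sketched as below. This is an assumed simplification, not the GALACTIC formulation: the two planes are taken to be parallel (sharing one normal vector) and to differ only in their offsets, with region A on the low side, region B on the high side, and the overlap zone between them. All names are hypothetical.

```python
from typing import Sequence


def classify_against_pair(
    point: Sequence[float],
    normal: Sequence[float],
    low_offset: float,
    high_offset: float,
) -> str:
    """Assign a point to region 'A', region 'B', or the ambiguous 'overlap' zone."""
    # Projection of the point onto the shared normal direction.
    projection = sum(n * x for n, x in zip(normal, point))
    if projection < low_offset:
        return "A"
    if projection > high_offset:
        return "B"
    # Between the two planes the pair cannot discriminate.
    return "overlap"
```

When most of the points the pair was built from land in the overlap zone, the pair contributes only weak votes, which is the situation described for about 21 of the hyperplanes.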

When  used  by  EFAGHY,  such  hyperplanes  can  be  doubly  misleading  since  not 
only  do  the  points  in  either  of  the  two  stability  regions  receive  at  best  a 
weak  overlap  vote,  but  points  which  actually  are  not  in  either  of  the  two 
regions  may  receive  a  yes  vote  due  to  the  unreliability  of  such  planes. 
While  the  P  vote  was  introduced  to  limit  the  impact  of  such  hyperplanes,  it 
cannot  compensate  for  the  advantage  to  a  stability  region  which  is  runner 
up  not  to  be  burdened  with  weak,  unreliable  hyperplanes. 

Besides this reason, for some hyperplanes there are test points in the
stability regions to be separated which are placed not in the ambiguous
overlap region but actually in the region belonging to the incorrect
stability condition. This has occurred in relatively few cases. The
test point data have undergone considerable transformation before they
emerge as an equation of a hyperplane pair, and thus some points may
violate the hyperplanes which they generated. In recognition of this,
GALACTIC adjusts the hyperplane position, not the orientation, so as to
eliminate borderline violations, but limits the amount of adjustment. We
need to verify the cause of such violations.


For these points, 51 percent were assigned by EFAGHY to the listed
stability condition. But using the detailed aeromechanical data for each
of the missed points, we found that the stability condition chosen by EFAGHY
coexisted at 6 points and that for 4 others the chosen stability region
was close. One point was off the stability map and is regarded as
somewhat anomalous; accordingly it was simply excluded.

This results in 39 hits in 56 points, or a 70 percent success rate.

For the remaining 17 points, only 3 are regarded as bad misses, since
EFAGHY gave the correct answer only the third or fourth highest number of
votes. Thus no hint of the correct answer was provided in 5 percent of
the cases.


The  detailed  examination  of  85  data  points  as  described  above  resulted  in 
the  fine  tuning  of  the  type  of  voting  and  of  mixed  voting  strategies 
which  would  produce  the  greatest  number  of  correct  choices  of  stability 
region  by  EFAGHY. 


The best method seems to be the P vote, that is, to choose P1, the
stability region with the most P votes. Nearly as good is to choose
B1, the stability region with the most B votes. When these two agree,
there is still greater likelihood of a correct answer. However, the P2
and B2 stability regions should also be considered since, when the P1
or B1 choices are incorrect, it is very often the case that the
correct, listed stability region was second choice. This was the case in
the preceding detailed examination of 85 data points, where the listed
region was sometimes second choice with the most votes going to a
coexisting region or a very near region.


For the 891 data points from the Annular Cascade Data Base which were
used by GALACTIC to generate the hyperplanes and which included the 85
data points, it was not possible to carry out the detailed examination as
was done on the 85 data points. Thus the only question that could be
asked is whether the listed stability condition was chosen.


For 84 percent of the 891 data points the listed stability condition was
among B1, B2, P1, P2.

For 74 percent of the points, B1 and P1 agreed, and when this occurred
the choice was correct 87 percent of the time. If it is acceptable to
refrain from a choice when B1 and P1 do not agree, a success rate of
87 percent is therefore indicated.

However, if it is always required to make a choice, the best strategy
seems to be P1, which has a 59 percent success rate. The next best
strategy seems to be B1, which has a success rate of 56 percent.

Detailed examination of the 26 points from File 55777B resulted in
raising the 77 percent success rate for literal correctness to 100
percent for practical correctness, a jump of 23 points. For the 57
points of File C2BDY, it raised the 51 percent success rate for literal
correctness to 70 percent for practical correctness. Combined, the
success rate for literal correctness was 59 percent, whereas detailed
examination raised the success rate for practical correctness to 79
percent, a jump of 20 points.
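The combined rates quoted above can be checked from the counts given for the two files. In the sketch below, the 29 literal hits for File C2BDY are inferred from the text (39 practical hits less the 6 coexisting and 4 close points); the variable names are illustrative.

```python
def percent(hits: int, points: int) -> int:
    """Success rate rounded to the nearest whole percent."""
    return round(100.0 * hits / points)


# File 55777B: 20 of 26 literally correct, all 26 practically correct.
literal_55777b, practical_55777b, points_55777b = 20, 26, 26

# File C2BDY: 57 points; one anomalous point is dropped for the practical count.
literal_c2bdy, points_c2bdy = 29, 57
practical_c2bdy, practical_points_c2bdy = 39, 56

combined_literal = percent(
    literal_55777b + literal_c2bdy, points_55777b + points_c2bdy
)
combined_practical = percent(
    practical_55777b + practical_c2bdy, points_55777b + practical_points_c2bdy
)
print(combined_literal, combined_practical)
```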

For the 891 test points, the P1 choice also gave a success rate of exactly
59 percent for literal correctness. Were a detailed examination made of
the aeromechanical situation of the missed points, it would therefore not
be surprising if a success rate of 79 percent were achieved for practical
correctness.

In  view  of  the  fact  that  only  about  74  percent  of  the  hyperplane  pairs 
are  fully  adequate,  it  would  not  be  surprising  if  only  79  percent  of  the 
test  points  were  correctly  classified.  If  non-linear  terms  could  raise 
to  near  100  percent  the  number  of  discriminating  surface  pairs,  then  a 
comparable improvement in the success rate is anticipated.

The technical proposal for this contract specified that the Validation
test points would include data from fan and compressor stages of advanced
military engines, commercial engines, and development engines and that
the stability regions would include the stall and choke flutter
boundaries, the stable operating regions and the stall and choke
regions. The number of Validation Points was not specified.


A total of 51 test points were selected. These were distributed among
seven sets of test data, as follows:


Aeromechanical Compressor/

High  Aspect  Ratio 
Development  Fighter  Engine  Case  1 
Fan  Rotor  1st  Stage 
Development  Fighter  Engine  Case  2 
Fan  Rotor  2nd  Stage 
Development  Fighter  Engine  Case  3 
Compressor  Rotor  Stage  4 
High  Bypass  Turbofan  Engine 
20"  Simulator 
Rotating  Rig  Test 


High Bypass Turbofan Engine

The stability codes at the head of the columns have the meaning:

00  Stall     stable
01  Stall     flexural flutter
02  Stall     torsional flutter
03  Stall     flexural and torsional flutter
07  Stall     separated flow
10  Interior  stable

We had intended to include some choke points from J85 data. However, we
found on closer examination of the data that choke and stall occur
simultaneously, the former at mid span and the latter at tip. Thus clear
Validation Points are not available for choke. Accordingly no Validation
Points of choke type were included.


In contrast to the Annular Cascade Data Base, for which the test points
were richly instrumented and measured with special care, the Validation
Points come from a non-research environment for which less information
was available, and what information was available had less precision.

The aeromechanical information which was generally available for
production engines, to which this contract work is addressed, consists of
the following:


Word  Name    Meaning

 2    PIN     Pressure at inlet
 3    TINLET  Temperature at inlet
 6    SLDTYQ  Solidity
 9    REY     Reynolds number
10    INC     Leading edge angle of incidence
11    MLE     Mach number at leading edge
14    VLE875  Velocity at a point on the leading edge distant from the
              blade root by 87.5% of the span
15    REDV1   Reduced velocity - flexure
16    REDV2   Reduced velocity - torsion

The word number denotes where the data occurs in the 20-word record used
for each test point, the same format being used both for points from the
Annular Cascade Data Base and for the Validation Points.

In anticipation of the limited amount of information available for the
Validation Points, it was necessary for GALACTIC to use no more than that
information from the much richer Annular Cascade Data Base in generating
hyperplanes. Otherwise the hyperplanes, though doubtless more correct,
would have been inapplicable to real world engine data. Indeed this
restriction on the aeromechanical data that could be used might be one of
the causes of some hyperplane pairs with heavily populated overlap
regions wherein discrimination between the two stability regions was not
possible.


Once GALACTIC had produced 91 pairs of hyperplanes, these were applied to
the 891 test points from which the hyperplanes had been constructed,
thereby developing the voting system contained in EFAGHY. We found that
59 percent of the test points were correctly identified in a narrow
literal sense by the hyperplane/voting system, and there is reason to
believe that 79 percent of the test points were correctly identified in a
broader practical sense. This is not surprising since only 74 percent
(67 of 91) of the hyperplanes were fully adequate, some 21 of the
stability region pairs showing intersections which might be avoided by
curved discriminating surfaces and 3 requiring improvement.

We further found that while the best two voting approaches were to choose
P1 or B1 (the top vote getters for P votes or B votes), giving the
mentioned 59 percent literal accuracy, if both approaches agreed (P1 =
B1) then there was an 87 percent probability of the choice being
correct in a literal sense.

Finally we found it was 84 percent certain that the correct stability
region would be P1, the region with most P votes, or P2, the runner up,
or B1, the region with most B votes, or B2, the runner up.

EFAGHY  was  then  applied  to  the  51  Validation  Points.  The  results  will  be 
compared  with  those  achieved  when  EFAGHY  was  applied  to  the  891  test 
points  from  the  Annular  Cascade  Data  Base. 

The results can best be presented in two tables. The first table shows
for each test point the codes chosen as B1, B2, P1 or P2 by
EFAGHY together with the number of votes each code received. The
greatest number of votes is marked with a prime (') and the second
greatest is marked with a double prime (''). Columns are marked with an
X to signal that the stability condition that actually prevailed was
among B1, B2, P1 or P2, that it was B1, that it was P1, and that
B1 and P1 agreed irrespective of whether they were correct.

For the Validation Points B1 was correct in a literal sense for only 5
points, 10 percent, and P1 for only 4 points, 8 percent. Furthermore
the stability regions chosen as B1 or P1 for the other points have
been reviewed and were found not to be correct in any practical sense
either; that is, the Validation Point is not in or near a transition to
the regions chosen as B1 or P1. These percentages contrast with 59
percent for EFAGHY applied to the Annular Cascade Data Base.

For 67 percent of the points (34 of 51) the B1 and P1 choices agreed,
but only 9 percent (3 of 34) were correct. Thus the agreement of B1 and
P1 is no indicator of a correct answer. For the Data Base B1 and P1
agreed for 74 percent of the points and when this occurred their choice
was correct for 87 percent of the cases.

For 61 percent of the points (31 of 51) the correct choice was among B1,
B2, P1, P2. This compares with 84 percent for the Data Base. The
conclusion from the above would seem to be that although neither B1 nor
P1 was proven useful as an indicator of the correct stability region, the
weaker result stands: that B1, B2, P1, P2 include the correct
result for a fair percentage of the cases. This result would suggest that
the hyperplanes have captured some of the reality of the Validation
Points.

However  even  this  mildly  favorable  conclusion  seems  to  be  unjustified  as 
the  following  table  shows. 

The following table shows for each Validation Point the B1, B2, P1,
P2 choices made by EFAGHY. The Validation Points having the same
(correct) stability code are grouped together, without identification, but
taken in the order used in the prior table.


The principal observation that can be made from the above table is that to
the hyperplane/voting system in EFAGHY, the Validation Points all look
similar, in that the choice 00, 01 or 03 is made for almost every
Validation Point. Since for 69 percent of the Validation Points the correct
choice is actually one of these three, it is to be expected by chance alone
that the correct choice would occasionally be made.

In fact, the response to certain groups of Validation Points is even more
striking. The previous table, for example, shows that for all six
Validation Points from the F101 Derivative Fighter Engine, Fan Rotor, 1st
Stage, the leading B and P scores are all almost identical. This
similarity holds also for the lower scores for the other stability
regions, not shown in the table.

There are seven such groupings of nearly identical votes for points. In
all but one grouping, there is a diversity of actual stability condition.
One possible explanation of the discrepancy between the hyperplane/voting
system in EFAGHY as applied to the Annular Cascade Data Base and as
applied to the Validation Points is that measurement of the nine
aeromechanical variables in production engines is not as exact as in the
Annular Cascade. To verify whether this was a factor, as well as to check
for closeness to transition zones, the Validation Points were rerun with
18 variations, namely each of the 9 aeromechanical variables was perturbed
by plus and minus 10 percent. In most cases the B1, B2, P1, P2 choices by
EFAGHY were the same as before. In hardly any case was the correct answer
chosen.
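The 18 variations per Validation Point can be generated as sketched below, assuming each variable is scaled by 1.10 or 0.90 while the other eight are held fixed. The variable names follow the 20-word record; the function name and the dictionary representation are illustrative assumptions.

```python
from typing import Dict, List, Tuple

NINE_VARIABLES = (
    "PIN", "TINLET", "SLDTYQ", "REY", "INC", "MLE", "VLE875", "REDV1", "REDV2",
)


def perturbed_variants(
    point: Dict[str, float], fraction: float = 0.10
) -> List[Tuple[str, float, Dict[str, float]]]:
    """Return (variable, sign, perturbed point) for each one-at-a-time change."""
    variants = []
    for name in NINE_VARIABLES:
        for sign in (+1.0, -1.0):
            variant = dict(point)  # copy so the original point is untouched
            variant[name] = point[name] * (1.0 + sign * fraction)
            variants.append((name, sign, variant))
    return variants
```

Each of the 18 variants would then be passed through the same voting procedure as the unperturbed point and the resulting choices compared.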

Another  possible  explanation  might  be  that  the  Validation  Points  covered 
too  narrow  a  range  relative  to  the  891  test  points  from  the  Annular 
Cascade  Data  Base,  so  that  the  Validation  Points  might  all  look  alike. 
Alternatively,  discrepancies  might  be  explained  if  the  range  of  the 
Validation  Points  exceeded  that  of  the  891  points  with  many  points  having 
variables  beyond  the  range  of  the  891  points. 


The following table shows for the 51 Validation Points the maximum and
minimum of each variable. The latter are then expressed relative to the
corresponding maximum and minimum of the 891 points from the Annular
Cascade Data Base, transformed to a scale from 0 to 1.

We can readily see that the range of the variables for the Validation Points
is not small relative to the Data Base, but generally occupies a substantial
portion within the range of the Data Base variables.
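The 0 to 1 scale can be sketched as a linear rescaling in which the Data Base minimum maps to 0 and the Data Base maximum maps to 1, so that a Validation Point value outside the Data Base range falls below 0 or above 1. The function names are hypothetical, and the linear form is an assumption consistent with the description.

```python
def relative_position(value: float, db_min: float, db_max: float) -> float:
    """Position of a value on a scale where the Data Base range spans 0 to 1."""
    if db_max <= db_min:
        raise ValueError("Data Base maximum must exceed its minimum")
    return (value - db_min) / (db_max - db_min)


def within_data_base_range(value: float, db_min: float, db_max: float) -> bool:
    """True when the value lies inside the range covered by the Data Base."""
    return 0.0 <= relative_position(value, db_min, db_max) <= 1.0
```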


However, inlet pressures for 65 percent (33 of 51) of the Validation Points are
below the range of the 891 Data Base points; Reynolds numbers for 45 percent
are above their Data Base range; solidity for 14 percent is below its Data Base
range.

In  fact  there  are  only  4  points  whose  data  all  lie  within  the  range  of  the  891 
Data  Base  points,  namely  points  61,  169,  176,  172  from  the  Rotating  Rig  Test. 

The  possibility  cannot  therefore  be  discounted  that  the  Validation  Points  do 
not  fit  the  hyperplanes  gotten  from  the  891  Data  Base  points  because  the 
Validation  Points  lie  partially  outside  the  range  of  the  Data  Base  points. 
However  the  possibility  loses  some  credibility  since  the  hyperplanes  do  not 
fit  the  four  Validation  Points  that  lie  within  the  range  of  the  Data  Base 
points  any  better  than  those  that  lie  partially  outside. 


As a result of the application of the hyperplanes produced by GALACTIC to
the Annular Cascade Data Base as well as to validation points not in the
data base, a number of issues have surfaced. These will now be grouped
under eight headings and discussed, together with a plan for the
resolution of each issue.


In the case of three hyperplane pairs (those that discriminate between
stability regions 00 and 21, between regions 00 and 20, and between
regions 10 and 23) the first discriminant method declared that the
regions were disjoint even in subspaces of only 6 or 7 of the 9
aeromechanical variables. But the Relaxed Discriminant procedure for
finding the discriminating planes was unable to do so even by using all 9
variables, either for the overlap or gap case, at least within the 2000
allowed iterations. While the two discriminating methods have proven
generally consistent, with allowance for numerical differences, these two
cases raise the question of a possible bug in the code. The discrepant
result in these two cases might however be due to insufficient
iteration or to numerical difference due to the complete diversity of the
two methods. In these three cases the latest hyperplane produced was
accepted by GALACTIC, which was, perhaps, not the best choice. But when
GALACTIC and later EFAGHY tested two of these hyperplanes against the
data points upon which they were constructed, we found that the majority
of test points in one stability region were misclassified.

Whether there is an error in the coding or simply an inadequacy, it
should be possible to improve the code so that not only in theory, but
also in practice, a solution will always be found which, at worst, may
place many points in an overlap region. Except for borderline cases, no
points in the data base should be misclassified. It is believed,
however, that this situation has affected relatively few hyperplanes
and test points, and that the effect is further diluted in the voting
process.


While only a few hyperplanes misclassify very many of the data points from
which they were generated, there are 21 which are ineffective in that they
place in an ambiguous overlap zone a large number of the test points from
which they were constructed. This is attributed to the inability of a
plane to separate the two stability regions, a situation in which a
curved surface is needed. Fortunately the existing linear code in
GALACTIC can be utilized for this purpose, without change. If a
nonlinear combination of the nine basic aeromechanical input variables
can be suggested on physical grounds, it is an easy matter to enter this
into one of the unused words in the 20-word record for each test point
and redo certain of the hyperplanes using 10 instead of 9 aeromechanical
variables.
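Entering a combination variable into the record might look like the following sketch. Word numbers 10 (INC) and 11 (MLE) are taken from the record layout; the choice of word 17 as the unused word and the use of the product of incidence and Mach number are purely hypothetical, chosen only to show the mechanics and not as a physically motivated flutter parameter.

```python
from typing import List

WORD_INC = 10    # leading edge angle of incidence (1-based word number)
WORD_MLE = 11    # Mach number at leading edge (1-based word number)
SPARE_WORD = 17  # hypothetical: assumed to be unused in the 20-word record


def add_combination_variable(record: List[float]) -> List[float]:
    """Return a copy of the record with a nonlinear term in the spare word."""
    if len(record) != 20:
        raise ValueError("each test point record must contain 20 words")
    augmented = list(record)
    # Illustrative nonlinear combination: incidence times Mach number.
    augmented[SPARE_WORD - 1] = record[WORD_INC - 1] * record[WORD_MLE - 1]
    return augmented
```

A plane fitted in the ten-variable space is linear in the new word but corresponds to a curved surface in the original nine variables, which is why the linear code needs no change.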

The new combination variable might be a plausible flutter parameter, such
as researchers have already explored, or it could be a second term in a
mathematical expansion of a basic variable. In either case, this is an
easily accomplished task, which would not require modification of GALACTIC
or EFAGHY and would be applied to only the 21 hyperplanes with heavily
populated overlap regions. It is however not clear how much
experimentation with various nonlinear combinations might be required
before finding suitable combination variables.

Note that GALACTIC provides little direct help in selecting nonlinear
combination variables if the above choices prove inadequate. The testing
of such variables, once conceived, is however easy. One help that
GALACTIC can offer in choosing the combination variables is the
identification of clusters that contain, in close proximity, test points
from more than one stability region. It can provide rotation of the test
space so that the points can be viewed most advantageously. This would
identify the linear combinations of variables which are playing a key role,
but it does not say how to combine these variables nonlinearly.


If we find that after introduction of nonlinear combination variables,
certain pairs of stability regions are still not separable, it may be
that separation requires use of other than the 9 aeromechanical
variables currently employed.

One  clear  indication  of  this,  even  before  experimentation  with  nonlinear 
combination  variables,  would  be  the  occurrence  of  points  of  one 
stability  region  surrounded  by  points  of  another.  Tools  for  analysis  of 
clusters  and  convex  hulls  currently  in  GALACTIC  would  be  helpful  in  this 
regard. 

In  such  a  situation  an  additional  variable  could  perhaps  lift  the 
surrounded  point  out  of  its  alien  environment. 

Were this the case, the practical conclusion would be that on production
engines additional instrumentation is needed to supply the additional
aeromechanical variables, if flutter prediction is required. The
implication would be that past attempts to predict flutter empirically
have been flawed by inadequate information.

Identification of additional necessary aeromechanical variables could be
a major benefit from this effort.

The 891 test points supplied to GALACTIC for construction of the
hyperplanes were selected carefully from the thousands of test points in
the Annular Cascade Data Base. The points were selected so as to define
the transition zones from one stability region to another. If
hyperplanes could be found to discriminate between these neighboring
points, it seemed plausible that they would also discriminate between
points remote from transition zones. Due to computer storage
limitations, duration of runs and the cost thereof, it was important to
construct the hyperplanes using as few test points as possible.

While the selection logic seems impeccable, it can be easily verified
that the selected points are representative, namely by applying the
hyperplanes to the other points from the Annular Cascade Data Base, those
which were not used by GALACTIC to construct the hyperplanes.

This  can  be  readily  and  cheaply  done  by  EFAGHY  since  there  is  virtually 
no  limit  on  the  number  of  points  to  which  the  hyperplanes  can  be 
applied. 

If  we  find  that  the  hyperplanes  correctly  predict  the  stability  region  in 
which  the  new  points  are  located,  then  this  indicates  a  basic  difference 
between  the  Annular  Cascade  Data  Base  and  the  real  world  Validation 
Points. 

On the other hand, if we find that substantial numbers of the new points
from the Annular Cascade Data Base are not correctly assigned, then this
indicates that the data base selected for GALACTIC was not adequate.

Should the latter be the case, then the corrective action is clear,
namely to include some or all of the incorrectly assigned points among
the test points used by GALACTIC and to call upon GALACTIC to recalculate
the hyperplanes which discriminate the stability regions to which these
points belong.


Hopefully  the  new  hyperplanes  would  prove  more  effective  both  in 
classifying  the  test  points  of  the  Annular  Cascade  Data  Base  which  would 
still  not  have  been  used  by  GALACTIC,  as  well  as  in  classifying  the 
Validation  Points. 

The  only  difficulty  anticipated  is  that  computer  storage  and  running  time 
limitations  may  be  encountered  if  it  is  necessary  to  include  many  more 
data  points  in  those  stability  regions  which  are  already  well  populated. 

Validity  of  the  Validation  Points 

There is no question but that the data in the Annular Cascade Data Base
are more precise than the real world Validation Points.

To determine if such inaccuracy could account for the incorrect
classifications produced by the hyperplanes, the Validation Points were
perturbed plus and minus 10 percent, in each of the nine variables
individually, in the belief that this would include any possible
inaccuracy in the data. The correct stability region was chosen by
EFAGHY in hardly any of the cases. Indeed the choice of stability region
did not change very frequently.

While the perturbations were not done in all variables simultaneously,
since this would entail 2^9 cases per test point, there is no indication
that the correct answer would be chosen more frequently.

An alternate approach might however prove more efficient, namely to
inquire how distant the Validation Point is from the nearest test point
from its stability region which was used in the construction of the
hyperplanes.


To do this would require some small amount of additional coding. But the
basic information is already available, namely the distance of the point
from each of the hyperplanes intended to bound the stability region to
which the point in fact belongs.
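The distance of a point from a hyperplane follows from the standard formula: project the point on the plane normal, subtract the offset, and divide by the length of the normal. The sketch below assumes a plane written as normal . x = offset; the report does not state which form GALACTIC uses, and the function name is hypothetical.

```python
from math import sqrt
from typing import Sequence


def signed_distance(
    point: Sequence[float], normal: Sequence[float], offset: float
) -> float:
    """Signed distance from a point to the hyperplane normal . x = offset."""
    length = sqrt(sum(n * n for n in normal))
    if length == 0.0:
        raise ValueError("the normal vector must be nonzero")
    projection = sum(n * x for n, x in zip(normal, point))
    # Positive on the side the normal points toward, negative on the other.
    return (projection - offset) / length
```

Because the nine variables differ greatly in magnitude (pressure and Reynolds number against solidity, for example), such distances are only comparable if the variables are first brought to a common scale.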

.6  Annular  Cascade  Data  Base  Applicability 

The ultimate issue is of course the applicability of the Annular Cascade
research vehicle to real world engine data.


If the hyperplane system of GALACTIC and EFAGHY works well on the test
points from the Annular Cascade Data Base - particularly on test points
which GALACTIC did not use - but poorly on the Validation Points, then
the issue of artificiality of the Data Base becomes more compelling.


It would be possible to demonstrate such incompatibility if it can be
shown that Validation Points are surrounded by Annular Cascade test
points belonging to a different stability region.
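
One way to make "surrounded by" concrete is a nearest-neighbor check, sketched below. This is an assumption of this illustration rather than a method stated in the report: a Validation Point is taken to contradict the data base when none of its k nearest Annular Cascade test points shares its actual stability region. The names and the choice of k are hypothetical.

```python
import math
from collections import Counter


def neighbor_regions(validation_point, cascade_points, k=5):
    """Count the region labels of the k nearest cascade test points."""
    ranked = sorted(
        cascade_points,
        key=lambda item: math.dist(validation_point, item[0]),
    )
    return Counter(region for _, region in ranked[:k])


def contradicts_data_base(validation_point, actual_region, cascade_points, k=5):
    """True if no neighbor among the k nearest shares the actual region."""
    counts = neighbor_regions(validation_point, cascade_points, k)
    return counts[actual_region] == 0
```

As with any distance-based test, the variables would need comparable scaling before such a check is meaningful.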


All of the foregoing work described in this section can be expected to
improve the validity of the hyperplanes. Thus such work can only sharpen
the issue of artificiality should the hyperplanes prove ultimately unable
to correctly classify the Validation test points.


Should this prove to be the case, the new question arises as to a
possible real world correction. Presumably there would be some parameter
which had one value for the Annular Cascade and some different values for
the Validation Points. There would of course be no clue in the Annular
Cascade Data Base as to what this parameter might be, as it would have
been common to all the test points.


Discovery of such a parameter might arise from engineering reasoning,
possibly from among the aeromechanical data recorded for the Annular
Cascade but not for production engines. On the other hand, it might be
devised simply as an empirical correction that would shift the stability
regions.


A third approach, of course, would be to incorporate the current
Validation Points into the data base used by GALACTIC, and to identify
new test points with which to validate the hyperplanes. It would be
interesting to see which pairs of stability regions would admit
hyperplanes violated by neither the Annular Cascade Data Base nor the
Validation Points.

Implied  Operational  Map  for  Validation  Points 

If it were concluded, after the various possible improvements were made,
that the Annular Cascade Data Base is not compatible with the Validation
Points, it would be possible to use EFAGHY to construct the operational
map for each set of Validation Points, as implied by the Annular Cascade
Data Base. If such a map were at all reasonable, a relation between the
actual and the implied map might be discovered. This could lead to a
fundamental insight as to which aeromechanical variable requires
correction, or to an empirical correction.

Combination  Stability  Regions 

The stability regions as presently defined do not include global regions,
for example for stall, but only the subregions for the different subtypes
of stall, such as flexural, torsional, etc.

It has been suggested that the hyperplanes might prove more effective if
they addressed the question of separating, for example, stall from choke,
or flutter from stability. Only after this basic question was addressed
would the type of stall be investigated.


Certainly  the  existing  system  can  be  used  to  discriminate  between  test 
points  classified  into  any  non-overlapping  regions.  But  this  would 
entail  of  course  an  independent  set  of  hyperplanes,  since  it  would  not 
be  possible  to  regard  both  generic  stall  and  flexural  stall  in  the  same 
analysis  as  these  regions  overlap. 

It  would  be  worthwhile  and  inexpensive  to  try  this  idea.  No  new 
programming  would  be  needed.  The  only  problem  would  be  the  large  number 
of  test  points  that  might  need  to  be  analyzed  simultaneously.  It  is  not 
clear  now  how  limiting  that  problem  might  be. 
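
The relabeling that such a trial would need can be sketched in a few lines of Python. The label names and the mapping below are hypothetical examples, not the set identifiers used in the Annular Cascade Data Base; the sketch only shows how subregion labels could be collapsed into generic regions for a first analysis, and how the points of one generic region could then be set aside for a second, independent analysis of subtype.

```python
# Hypothetical mapping from subregion labels to generic regions.
GENERIC_REGION = {
    "flexural stall": "stall",
    "torsional stall": "stall",
    "choke": "choke",
    "stable": "stable",
}


def relabel(test_points, mapping=GENERIC_REGION):
    """Replace each subregion label by its generic region (first analysis)."""
    return [(coords, mapping[label]) for coords, label in test_points]


def points_within(test_points, generic, mapping=GENERIC_REGION):
    """Keep original labels, but only for points of one generic region
    (input to the second, independent analysis)."""
    return [(c, label) for c, label in test_points if mapping[label] == generic]
```

Because the generic and the detailed labels are never used in the same analysis, the overlap between regions noted above does not arise.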

The principal conclusion is that a general purpose technology has been
developed and implemented in checked out computer code for the analysis
of large data bases. While this was devised for application to the
Annular Cascade Data Base for empirical prediction of flutter, the
technology is general purpose, with much broader application.


The most important of these technological developments are:


o  Relaxed Discriminant Method for determining two parallel hyperplanes
that separate two sets of points

o  Discriminant Feasibility Method to determine whether two sets of
points are separable by a hyperplane

o  Stepping Stone Cluster Method to identify clusters using the two
metrics, Euclidean and Stepping Stone, without need to quantify in
advance what is meant by a small distance

o  Hyperplane + Voting system to identify regions occupied by various
groupings of like points. These regions can then be used to predict
the grouping to which a new point would belong.
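
The voting idea in the last item can be illustrated with a minimal sketch. It is not the GALACTIC/EFAGHY code: it assumes each hyperplane is stored as a normal vector, an offset, and the two set labels it separates, and that each hyperplane casts one vote for the set on whose side the point falls. The several weighted vote types that appear in the program output are not modeled, and all names are hypothetical.

```python
from collections import Counter


def classify_by_vote(point, hyperplanes):
    """Tally one vote per hyperplane and return (winners, votes).

    Each hyperplane is (normal, offset, set_if_positive, set_if_negative).
    winners is a sorted list, so a tie for first place is reported as
    several labels rather than being broken arbitrarily.
    """
    votes = Counter()
    for normal, offset, set_pos, set_neg in hyperplanes:
        side = sum(n * x for n, x in zip(normal, point)) - offset
        votes[set_pos if side >= 0 else set_neg] += 1
    top = max(votes.values())
    winners = sorted(label for label, count in votes.items() if count == top)
    return winners, votes


# Hypothetical one-variable example with three regions.
planes = [
    ([1.0], 1.0, "stable", "stall"),
    ([1.0], 3.0, "choke", "stable"),
    ([1.0], 2.0, "choke", "stall"),
]
print(classify_by_vote([0.5], planes)[0])  # ['stall']
```

Returning all tied winners mirrors the distinction between unique and tied first place that a voting scheme of this kind has to report.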


In addition there were several other developments: the Non-Objective
Simplex Method, LI eigenanalysis, and matrix techniques to determine the
shape and size of groupings of points. These methods can be applied to
other engineering problems, to related maintenance and supply problems,
and to completely different data bases, for example data bases concerned
with personnel.

The above GALACTIC/EFAGHY system has been applied successfully to 891
test points of the Annular Cascade Data Base.


For 83 of these test points whose aeromechanical situation was looked at
closely, the system correctly identified the stability condition of 59
percent of the points in a narrow literal sense. But in a practical
sense, 79 percent of the points were correctly identified.

When applied to the 891 test points, the system again correctly
identified the stability condition of 59 percent of the points in a
narrow literal sense. Since it was not possible to look closely at the
aeromechanical situation of all of these points, it is not possible to
say how many were correct in a broader practical sense. But since the
59 percent correctness in a literal sense agreed for both sample and
population, it can be conjectured that the 79 percent correctness in a
practical sense might apply not only to the sample but to the population
as well.

These already favorable success ratios need to be placed in
perspective. First, the test points to which the Hyperplane/Voting
system was applied are the same test points used in constructing the
system, so that high success is to be expected. Second, only 76 percent
of the hyperplanes are adequate without nonlinear variables; thus, there
are various easily available steps using the existing Hyperplane/Voting
system by which the success ratio can be increased further.
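
The first caveat, that success measured on the construction points is optimistic, is usually addressed by setting some points aside before construction. The sketch below is a generic illustration and not part of the GALACTIC/EFAGHY system: `classify` stands for any function that maps coordinates to a predicted stability condition, and the split fraction and seed are arbitrary assumptions.

```python
import random


def split_points(points, holdout_fraction=0.25, seed=0):
    """Shuffle and split (coordinates, label) pairs into
    (construction_points, held_out_points)."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def success_ratio(classify, points):
    """Fraction of points whose predicted label equals the actual label."""
    correct = sum(1 for coords, label in points if classify(coords) == label)
    return correct / len(points)
```

Comparing `success_ratio` on the construction points with the same ratio on the held-out points would indicate how much of the reported success is due to reuse of the construction data.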

Hyperplane/Voting System Applied to Validation Points

The Hyperplane/Voting System applied to 51 Validation Points taken from
seven sets of engine test data was, however, not successful in
identifying the stability condition, either in a narrow literal sense
or in a broader practical sense.


This  raises  questions: 


o  whether the Annular Cascade Data Base is representative of real
engine data

o  whether  the  891  point  sample  used  to  construct  the  hyperplanes  was 
representative  of  the  full  Annular  Cascade  Data  Base. 


Specific  steps  are  proposed  for  resolving  these  questions,  and  for  using 
the  insights  thus  obtained  to  improve  the  predictive  ability  of  the 
Hyperplane/Voting  systems  when  applied  to  real  engine  data. 

VAX4> TYPE EFI1.DAT;34
VAX4> LO


First is presented the output that resulted from the batch input of
Appendix F.


Then follows the output from a time sharing session of a slightly later
version of EFAGHY. Because the session was in time sharing, the input is
an integral part of the output.

VAX4> TYPE EF01.LOG;1
$ ! SYSLOGIN.COM  System-wide standard login file
$ ! LOGIN.COM  User default login command file
[the remaining lines of the login command files are illegible in the source]
$ ASSIGN EFI1.DAT FOR005
%DCL-I-SUPERSEDE, previous value of FOR005 has been superseded
$ RUN EFAGHY


EFAGHY  (EMPIRICAL  FLUTTER  ANALYSIS  BY  GALACTIC  HYPERPLANES) 
RUN  ON  05/22/87  AT  16.75  HOURS  ON  BATCH 


ENTER NUMBER & NAMES OF DATA FILES, WORDS/RECORD, Y/N TO CLEARFILES
1 35777B
20 Y

NAME GALACTIC HYPERPLANE FILE
FLPLANES

NAME OUTPUT FILE FOR SET ASSIGNMENTS, ' ' TO PRINT


ENTER N (LE 14) THEN ID OF N SETS TO BE EXCLUDED
0

ENTER N, THEN N PAIRS OF IDS TO BE LABELED AS THE FIRST
0

ENTER N THEN THE N HYPERPLANES TO BE USED (0 0 FOR ALL)

ENTER 0 OR 1 TO NOTWRITE OR WRITE THE FOLLOWING OUTPUT OPTIONS
HYP,PTS,VOT,ASS,ASP

PT #  ACT ID  CANID  YVOTE  NVOTE  OVOTE  MAX%O  MAX%G  BVOTE

5

0.9 

4 

1 

10 

1 

1 

-7 

0.2 

'  .  *; 

3 

6 

7 

0 

0 

-1 

0 . 0 

6 

1 

9 

0 

_  c; 

0.0 

7 

5 

7 

1 

0 

-  1 

0 .  £ 

10 

3 

4» 

£ 

0 

9 

0.8 

r  .  „•  *>-.  •:  .-.v 

20 

8 

2 

1 

2 

9 

0.7 

22 

7 

4 

0 

e; 

0.7 

;•  .  . 

23 

1 

10 

1 

1 

-7 

0.7 

£7 

2 

8 

6 

3 

_ 

0.0 

14 

21 

20 

10* 

1 

i 

1 

li* 

0 .  £ 

•  •  .  -i:  ■  • 

21 

21 

9 

3 

i 

0 

7 

0.5 

•  ‘ 

0 

7 

1 

5 

0 

11* 

0.9 

*  . 

1 

7 

A. 

4 

0 

9 

0.9 

?  * 

sy 

6 

3 

4 

0 

7 

0.9 

5 

e| 

2 

1 

3 

0.7 

*.  •  •  :• 

4 

i 

12 

0 

0 

ii 

0.0 

5 

£ 

7 

0 

0 

-i 

0.0 

i 

£ 

i 

9 

0 

3 

-5 

0 . 0 

7 

4 

6 

2 

1 

1 

0.1 

10 

e; 

2 

£ 

0 

9 

0.  £ 

22 

5 

4 

2 

2 

BJ 

0.9 

23 

2 

8 

1 

2 

_  O 

0.  £ 

27 

3 

8 

0 

2 

_ 

0 . 0 

15 

2! 

21 

10* 

1 

im. 

0 

1 1 

0.5 

21 

22 

10* 

1 

•*> 

0 

M 

0.9 

1 

0 

7 

4 

2 

0 

e 

0.5 

1 

4 

8 

1 

0 

_  3 

0.  £ 

A. 

£ 

3 

0 

L 

I- 

0 . 0 

3 

4 

7 

0 

> 

L. 

-i 

0 . 0 



26  21 


PLACEMENT OF IDENTIFIED SETS

                    YVOTE   CVOTE   L'VOTE   BVOTE
UNIQUE 1ST PLACE      11      12      19       19
TIED 1ST PLACE         9       6       0        0
OTHER                  6       8       7        7

DISCRIMINATION SUCCESS BY HYPERPLANE

HYPERPLANE          POINTS IN THE FIRST SET          POINTS IN THE SECOND SET
  #  SET1  SET2   YVOTE  NVOTE  OVOTE  GVOTE    Y%T
  1     0     6      12      0      0      0   100.
  2     0    27      12      0      0      0   100.
  6     1     6       4      0      0      0   100.
  7     1    27       3      0      0      1    75.
  8    21     6       9      0      0      0   100.
  9    21    27       9      0      0      0   100.
 26     2     3       0      1      0      0     0.
 27     2     4       1      0      0      0   100.
 28     2     6       1      0      0      0   100.
 29     2    27       1      0      0      0   100.
 35    21     2       9      0      0      0   100.
 36    21     3       8      1      0      0    89.
 37    21     4       9      0      0      0   100.
 42    21     7       9      0      0      0   100.
 43    21    22       5      1      3      0    56.
 44    21     5       9      0      0      0   100.
 45    21    23       9      0      0      0   100.
 46     1     7       0      1      3      0     0.
 47     1    22       4      0      0      0   100.
 48     1     5       4      0      0      0   100.
 49     1    23       4      0      0      0   100.
 54     0     7      12      0      0      0   100.
 55     0    22      12      0      0      0   100.
 56     0     5      12      0      0      0   100.
 57     0    23      12      0      0      0   100.
 58     2    20       1      0      0      0   100.

0  100. 


23 

0 

10 

1 

2 

-7 

0.6 

0.8 

1.1  - 

10.9 

m 

27 

3 

5 

0 

3 

3 

0.0 

0.4 

7  1 5 

1  .9 

0 

11* 

0 

2 

0 

13* 

0.8 

0.0 

12.3* 

12.1* 

1 

"11* 

2 

0 

0 

9 

0.0 

0.0 

11.0 

9.0 

m 

2 

8 

3 

0 

0 

3 

0.0 

0.0 

8.0 

3.0 

3 

9 

3 

1 

0 

7 

0.3 

0.0 

9.3 

5.6 

K'V 

4 

2 

11 

0 

0 

-9 

0.0 

0.0 

2.0 

-9.0 

|$ 

5 

1 

10 

0 

2 

-7 

0.0 

1.0 

1 .8 

-9.3 

6 

3 

9 

0 

1 

0.0 

0.2 

3.8 

-5 . 4 

7 

8 

5 

0 

0 

3 

0.0 

0.0 

8.0 

3.0 

10 

4 

4 

5 

0 

3 

0.9 

0.0 

6.0 

-1.0 

•  j 

20 

8 

4 

1 

0 

5 

0.8 

0.0 

8.8 

4.6 

21 

4 

7 

1 

i 

- 1 

0.6 

0 . 0 

5.6 

-1.8 

22 

4 

o 

V 

1 

6 

0. 1 

0.0 

4.1 

-4.8 

$ 

23 

0 

10 

1 

2 

-7 

0.7 

0.8 

1.1  - 

10.9 

27 

9 

4 

0 

0 

5 

0.0 

0.0 

9.0 

5.0 

21 

10* 

1 

2 

0 

11 

0.7 

0.0 

11.2* 

9.4* 

0 

8 

4. 

3 

0 

9 

t.O 

0.0 

10.4 

7.8 

1 

6 

5 

2 

0 

3 

0.1 

0.0 

6 . 1 

-0.8 


HYPERPLANE          POINTS IN THE FIRST SET          POINTS IN THE SECOND SET
  #  SET1  SET2   YVOTE  NVOTE  OVOTE  GVOTE    Y%T   YVOTE  NVOTE  OVOTE  GVOTE    Y%T
 61     1     2       3      0      1      0    75.       0      0      1      0     0.
 62     1     3       4      0      0      0   100.
 63     1     4       4      0      0      0   100.
 64     2    10       1      0      0      0   100.
 67     0     2       7      0      5      0    58.       0      0      1      0     0.
 68     0     3       0      0     12      0     0.
 69     0     4       7      0      5      0    58.
 70    21     1       8      1      0      0    89.       1      0      0      0   100.
 71    21    20       8      1      0      0    89.       1      0      0      0   100.
 72     1    20       4      0      0      0   100.
 73    21    10       0      0      9      0     0.
 75     0    20      12      0      0      0   100.
 76    21     0       0      0      0      0     0.       9      0      0      0   100.
 77     1    10       4      0      0      0   100.
 78     1     0       0      1      3      0     0.       1      0      5      0    17.
 79     0    10       0      0     12      0     0.
 80     2     7       0      1      0      0     0.       1      0      0      0   100.
 81     2    22       1      0      0      0   100.
 82     2     5       1      0      0      0   100.
 83     2    23       1      0      0      0   100.

TIME USED: 8.54 SECONDS, COMPLETED AT 16.75 HOURS

ENTER 0,1 AS THERE ARENT,ARE MORE CASES
0
FORTRAN STOP
$ EXIT
CASEY        job terminated at 22-MAY-1987 16:45:08.61

Accounting information:
Buffered I/O count:          54      Peak working set size:   1206
Direct I/O count:           110      Peak page file size:     1733
Page faults:               1351      Mounted volumes:            0
Charged CPU time:  0 00:00:07.23     Elapsed time:    0 00:00:13.89

VAX4>




VAX4> RUN EFBLBN

============================================================
EFAGHY (EMPIRICAL FLUTTER ANALYSIS BY GALACTIC HYPERPLANES)
RUN ON 10/14/87 AT 14.83 HOURS ON TSHARE

ENTER NUMBER & NAMES OF DATA FILES, WORDS/RECORD, Y/N TO CLEARFILES
1 753777B7 20 7Y7

%FOR-F-INPCONERR, input conversion error
  unit 5  file SYS$INPUT:.;
  user PC 00009BA1
%TRACE-F-TRACEBACK, symbolic stack dump follows
module name     routine name     line     rel PC
                                          00014CC2
                                          00014BEF
EFBLBN$MAIN     EFBLBN$MAIN      117      000001A1

VAX4> RUN EFBLBN

============================================================
EFAGHY (EMPIRICAL FLUTTER ANALYSIS BY GALACTIC HYPERPLANES)
RUN ON 10/14/87 AT 14.64 HOURS ON TSHARE

ENTER NUMBER & NAMES OF DATA FILES, WORDS/RECORD, Y/N TO CLEARFILES
1 '35777B' 20 'Y'

NAME GALACTIC HYPERPLANE FILE
'FLPLANES'

ENTER 0,1 TO NOTPERTURB,PERTURB INPUT DATA FILES
0

NAME OUTPUT FILE FOR SET ASSIGNMENTS, ' ' TO PRINT
7 7

%FOR-F-INPCONERR, input conversion error
  unit 5  file SYS$INPUT:.;
  user PC 00009DED
%TRACE-F-TRACEBACK, symbolic stack dump follows
module name     routine name     line     rel PC
                                          00014CC2
EFBLBN$MAIN     EFBLBN$MAIN               000002ED

VAX4> RUN EFBLBN

============================================================
EFAGHY (EMPIRICAL FLUTTER ANALYSIS BY GALACTIC HYPERPLANES)
RUN ON 10/14/87 AT 14.66 HOURS ON TSHARE

ENTER NUMBER & NAMES OF DATA FILES, WORDS/RECORD, Y/N TO CLEARFILES
1 '35777B' 20 'Y'

NAME  GALACTIC  HYPERPLANE  FILE 


'FLPLANES'

ENTER 0,1 TO NOTPERTURB,PERTURB INPUT DATA FILES
0

NAME OUTPUT FILE FOR SET ASSIGNMENTS, ' ' TO PRINT
' '

ENTER N (LE 14) THEN ID OF N SETS TO BE EXCLUDED
0 0

ENTER N, THEN N PAIRS OF IDS TO BE LABELED AS THE FIRST
0 0

ENTER N THEN THE N HYPERPLANES TO BE USED (0 0 FOR ALL)
0 0

ENTER 0 OR 1 TO NOTWRITE OR WRITE THE FOLLOWING OUTPUT OPTIONS
HYP,PTS,VOT,ASS,ASP,ASC,HPT
0111111

ENTER N (LE 20) THEN N RECORD NUMBERS FOR HYPERPLANE/POINT OUTPUT
2
1 2 3

P VOTE CONFIDENCE LEVEL = 0.5000000

RELIABILITY INDEX PER SET

ID     0     1     2     3     4     5     6     7    10    20    21    22    23    27
X   0.71  0.73  0.77  0.7S  0.71  0.31  0.0S  0.77  0.59  0.79  0.73  0.67  0.57  0.17

PT  ACT ID  CANID  YVOT  NVOT  OVOT  GVOT  CVOT  MX%O  MX%C  WVOTE  EVOTE  PVOTE

DATA FROM FILE 35777B


POINT NUMBER 1
 432.0000     67.50000       904.3000       19.50000    2.790000
 1.321000     49.91700       0.6S60000      10.36900    1.700000
 0.6650000    1.141000       0.1300000      937.0000    4.629000
 1.770000     0.0000000E+00  0.0000000E+00  21.00000    21.00000


HYPERPLANE VOTES FOR POINT # 432, RECORD # 1
HYPL#  SETID  DISTANCES NORMAL TO PLANES  SETID

1 

6 

1 

27.21380 

27.2138!!: 

61.63589 

0  ^'9 

2 

27 

r 

24.01917 

24.020013 

56.53007 

°  jjj;a 

3 

27 

] 

-1.04922  -0.85837 

-0. 40235 C 

6 

4 

6 

Z 

-0.13576 

-0.185743 

-0.11681 

10  Is 

5 

27 

t 

0.10979 

0.109803 

0.13346 

10 

6 

6 

3 

-0.14940 

-0.14234C 

0.06251 

1  §0 

7 

27 

3 

-2.28858 

-2. 28303 C 

-2.23948 

• 

8 

6 

3 

0.07854 

0.1 5722 C 

0.26433 

21  | 

9 

27 

3 

-2.95232 

-2. 94939 C 

-2.92339 

21  li 

10 

6 

3 

-1.98267  -1.94685 

-1 .92326C 

20 

11 

27 

3 

-2.03761  -1.96018 

-1.9241SC 

20  hi 

12 

22 

0.336293 

0.52278 

0.579521 

7  1 

13 

5 

3 

-1.06194 

-1 .04936C 

-1.02470 

7  jft 

14 

6 

3 

-1.28627 

-1.25879C 

-0.97135 

7  $ 

15 

27 

3 

-1.46843  -1.42749 

-1.37155r 

7  L 



16 

23 

3 

-2.51894 

-2.31391 

-2.3C384£ 

7 

17 

5 

-  3 

0.03803 

0.03929£ 

0.57164 

22 

18 

6 

3 

-0.61770 

-0.50013£ 

-0.43982 

22 

19 

27 

-0.951903 

-0.93133 

-0.91507£ 

20 

23 

3 

-1.69360 

-1 .67326£ 

-1.42078 

22 

21 

6 

-0.951903 

-0.59412 

-0.58419£ 

5 

22 

27 

-0.951903 

-0.93183 

-0 .584 19£ 

c; 

23 

23 

3 

-2.6828! 

-2.49379 

-2.25033£ 

«r 

74 

i 

-1  77AS*; 

L 

L.  *f 

L.  w 

J 

im  •  1  V  • J  7  S 

4  •  /  /  OOj 

1 • w V 7L.L 

C 

25 

23 

3 

0.73634 

0 . 93 132 £ 

0.95190 

27 

26 

3 

C 

-0.21974 

-0.213103 

-0.21490 

27 

4 

-1.063493 

-0.99242 

-0.97353£ 

28 

6 

3 

-2.15086 

-2. 09786 £ 

-2.03703 

2 

29 

27 

3 

-0.4992S 

-0.42550£ 

-0.23202 

z 

30 

4 

-1.134023 

- 1 . 07920 

-1. 06043 E 

•2* 

31 

6 

3 

-1.99833 

-1.83772E 

-1.79691 

32 

27 

3 

-2.60251 

-2. 55S97C 

-2.49017 

j 

33 

6 

3 

1.10985 

1.11110 

1.1 1351  £ 

4 

34 

27 

3 

-0.61296 

-0.40740 

-0 . 36990E 

4 

35 

2 

C 

-0.39357 

-0.393573 

-0.33708 

21 

36 

3 

E 

0.13106 

0.131073 

0.15767 

21 

37 

4 

£ 

0.13857 

0.133573 

0.14462 

21 

38 

7 

£ 

0.12349 

0.128503 

0.13533 

20 

39 

22 

0.332933 

0.34155 

0.36176E 

20 

40 

5 

3 

-0. 10342 

-0.09257E 

0.40926 

20 

41 

23 

3 

-0.54755 

-0.52242 

-0. 47550 £ 

20 

42 

7 

3 

0.37558 

0. 40575 E 

0.43649 

21 

43 

22 

0. 20672£ 

0.20632 

0.209253 

21 

44 

5 

3 

0.13619 

0. 40553 £ 

0.55748 

21 

45 

23 

3 

-2.27999 

-2.25574E 

-2.02524 

21 

46 

7 

£ 

-0.02355 

-0.02090 

-0.019633 

i 

47 

22 

0.307293 

0.32760 

0.32803E 

1 

48 

5 

3 

-0.13876 

-0. 13876E 

0.28776 

1 



49 

23 

3 

0.20178 

0.21807 

0.275751 

1 

50 

7 

...  1 

-0.23324 

-0.206133 

-0.10642 

10 

51 

22 

C 

0.10292 

0.10609 

0.110913 

10 

52 

5 

] 

-1.97558 

-1 .975561 

17.82339 

10 

-  53 

23 

c 

0.27362 

0.36437 

0.554803 

10 

54 

7 

0.790661 

0.82410 

0.927613 

0 

55 

22 

1 

21.73051 

21.730533 

50.19141 

II 

56 

5 

3 

28.22050 

28.220511 

65.47410 

o 

57 

23 

0.254313 

0.25994 

0.259991 

0 

53 

20 

-0.520303 

-0.50071 

-0.500711 

2 

59 

20 

-0.083241 

-0.05049 

-0.050493 

3 

60 

4 

0.007793 

0.01297 

0.012981 

20 

61 

2 

0.47075C 

0.51116 

0.548303 

1 

62 

3 

0.23201 C 

0.29632 

0 . 29656  3 

j 

63 

4 

1 

0.30130 

0.301303 

0.30340 

1 

64 

10 

-0.1S290C 

-0.16497 

-0.163403 

2 

65 

10 

0.149911 

0.24723 

0.247333 

: 

66 

4 

C 

0.38366 

0.48291 

0.499493 

10 

67 

2 

0.09612C 

0,15425 

0.221603 

0 

68 

3 

0. 07739 C 

0.12111 

0.258283 

0 

69 

4 

1 

0.19506 

0.199413 

0 . 20029 

o 

70 

1 

1 

-0.17130 

-0.171303 

-0.15081 

2 1 

71 

20 

C 

-0.07224 

-0.042463 

-0.03696 

2 1 

72 

20 

0.232603 

0.27950 

0.279501 

i 

73 

10 

1 

-0.43299 

-0.30725 

-0.242243 

21 

74 

10 

C 

-0.41106 

-0.22954 

-0.180583 

20 

75 

20 

1 

23.07658 

24.386333 

71.27054 

0 

76 

0 

-71.104303 

-20.16585 

-20.165621 

2 1 

77 

10 

0.437991 

0 . 496 1 4 

0.528003 

i 

73 

0 

C 

0 . 2336* 

0.23428 

0.256803 

i 

79 

10 

1 

-0.23210 

-0.03140 

0.025053 

0 

SO 

7 

t 

0.07367 

0.073673 

0.10607 

- 

81 

22 

1 

0.39447 

0.411953 

0.44982 

2 
